SCHAUM'S OUTLINE SERIES
THEORY AND PROBLEMS OF
LINEAR ALGEBRA
BY
SEYMOUR LIPSCHUTZ, Ph.D.
Associate Professor of Mathematics
Temple University
McGRAW-HILL BOOK COMPANY
New York, St. Louis, San Francisco, Toronto, Sydney
Copyright © 1968 by McGraw-Hill, Inc. All Rights Reserved. Printed in the
United States of America. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise, without the
prior written permission of the publisher.
Preface
Linear algebra has in recent years become an essential part of the mathematical
background required of mathematicians, engineers, physicists and other scientists.
This requirement reflects the importance and wide applications of the subject matter.
This book is designed for use as a textbook for a formal course in linear algebra
or as a supplement to all current standard texts. It aims to present an introduction to
linear algebra which will be found helpful to all readers regardless of their fields of
specialization. More material has been included than can be covered in most first
courses. This has been done to make the book more flexible, to provide a useful book
of reference, and to stimulate further interest in the subject.
Each chapter begins with clear statements of pertinent definitions, principles and
theorems together with illustrative and other descriptive material. This is followed
by graded sets of solved and supplementary problems. The solved problems serve to
illustrate and amplify the theory, bring into sharp focus those fine points without
which the student continually feels himself on unsafe ground, and provide the repetition
of basic principles so vital to effective learning. Numerous proofs of theorems are
included among the solved problems. The supplementary problems serve as a complete
review of the material of each chapter.
The first three chapters treat of vectors in Euclidean space, linear equations and
matrices. These provide the motivation and basic computational tools for the abstract
treatment of vector spaces and linear mappings which follows. A chapter on eigenvalues and eigenvectors, preceded by determinants, gives conditions for representing
a linear operator by a diagonal matrix. This naturally leads to the study of various
canonical forms, specifically the triangular, Jordan and rational canonical forms.
In the last chapter, on inner product spaces, the spectral theorem for symmetric operators is obtained and is applied to the diagonalization of real quadratic forms. For
completeness, the appendices include sections on sets and relations, algebraic structures
and polynomials over a field.
I wish to thank many friends and colleagues, especially Dr. Martin Silverstein and
Dr. Hwa Tsang, for invaluable suggestions and critical review of the manuscript.
I also want to express my gratitude to Daniel Schaum and Nicola Monti for their very
helpful cooperation.
Seymour Lipschutz
Temple University
January, 1968
CONTENTS
Page
Chapter 1 VECTORS IN Rⁿ AND Cⁿ 1
Introduction. Vectors in Rⁿ. Vector addition and scalar multiplication. Dot product. Norm and distance in Rⁿ. Complex numbers. Vectors in Cⁿ.
Chapter 2 LINEAR EQUATIONS 18
Introduction. Linear equation. System of linear equations. Solution of a system of linear equations. Solution of a homogeneous system of linear equations.
Chapter 3 MATRICES 35
Introduction. Matrices. Matrix addition and scalar multiplication. Matrix
multiplication. Transpose. Matrices and systems of linear equations. Echelon
matrices. Row equivalence and elementary row operations. Square matrices.
Algebra of square matrices. Invertible matrices. Block matrices.
Chapter 4 VECTOR SPACES AND SUBSPACES 63
Introduction. Examples of vector spaces. Subspaces. Linear combinations, linear spans. Row space of a matrix. Sums and direct sums.
Chapter 5 BASIS AND DIMENSION 86
Introduction. Linear dependence. Basis and dimension. Dimension and subspaces. Rank of a matrix. Applications to linear equations. Coordinates.
Chapter 6 LINEAR MAPPINGS 121
Mappings. Linear mappings. Kernel and image of a linear mapping. Singular
and nonsingular mappings. Linear mappings and systems of linear equations.
Operations with linear mappings. Algebra of linear operators. Invertible
operators.
Chapter 7 MATRICES AND LINEAR OPERATORS 150
Introduction. Matrix representation of a linear operator. Change of basis.
Similarity. Matrices and linear mappings.
Chapter 8 DETERMINANTS 171
Introduction. Permutations. Determinant. Properties of determinants. Minors and cofactors. Classical adjoint. Applications to linear equations. Determinant of a linear operator. Multilinearity and determinants.
Chapter 9 EIGENVALUES AND EIGENVECTORS 197
Introduction. Polynomials of matrices and linear operators. Eigenvalues and eigenvectors. Diagonalization and eigenvectors. Characteristic polynomial, Cayley-Hamilton theorem. Minimum polynomial. Characteristic and minimum polynomials of linear operators.
Chapter 10 CANONICAL FORMS 222
Introduction. Triangular form. Invariance. Invariant direct-sum decompositions. Primary decomposition. Nilpotent operators. Jordan canonical form. Cyclic subspaces. Rational canonical form. Quotient spaces.
Chapter 11 LINEAR FUNCTIONALS AND THE DUAL SPACE 249
Introduction. Linear functionals and the dual space. Dual basis. Second dual
space. Annihilators. Transpose of a linear mapping.
Chapter 12 BILINEAR, QUADRATIC AND HERMITIAN FORMS 261
Bilinear forms. Bilinear forms and matrices. Alternating bilinear forms.
Symmetric bilinear forms, quadratic forms. Real symmetric bilinear forms.
Law of inertia. Hermitian forms.
Chapter 13 INNER PRODUCT SPACES 279
Introduction. Inner product spaces. Cauchy-Schwarz inequality. Orthogonality. Orthonormal sets. Gram-Schmidt orthogonalization process. Linear functionals and adjoint operators. Analogy between A(V) and C, special operators. Orthogonal and unitary operators. Orthogonal and unitary matrices. Change of orthonormal basis. Positive operators. Diagonalization and canonical forms in Euclidean spaces. Diagonalization and canonical forms in unitary spaces. Spectral theorem.
Appendix A SETS AND RELATIONS 315
Sets, elements. Set operations. Product sets. Relations. Equivalence
relations.
Appendix B ALGEBRAIC STRUCTURES 320
Introduction. Groups. Rings, integral domains and fields. Modules.
Appendix C POLYNOMIALS OVER A FIELD 327
Introduction. Ring of polynomials. Notation. Divisibility. Factorization.
INDEX 331
Chapter 1
Vectors in Rⁿ and Cⁿ
INTRODUCTION
In various physical applications there appear certain quantities, such as temperature
and speed, which possess only "magnitude". These can be represented by real numbers and
are called scalars. On the other hand, there are also quantities, such as force and velocity,
which possess both "magnitude" and "direction". These quantities can be represented by
arrows (having appropriate lengths and directions and emanating from some given reference point O) and are called vectors. In this chapter we study the properties of such
vectors in some detail.
We begin by considering the following operations on vectors.
(i) Addition: The resultant u + v of two vectors u and v is obtained by the so-called parallelogram law, i.e. u + v is the diagonal of the parallelogram formed by u and v.
(ii) Scalar multiplication: The product ku of a real number k by a vector u is obtained by multiplying the magnitude of u by k and retaining the same direction if k ≥ 0 or the opposite direction if k < 0.
Now we assume the reader is familiar with the representation of the points in the plane
by ordered pairs of real numbers. If the origin of the axes is chosen at the reference point
above, then every vector is uniquely determined by the coordinates of its endpoint. The
relationship between the above operations and endpoints follows.
(i) Addition: If (a, b) and (c, d) are the endpoints of the vectors u and v, then (a + c, b + d) will be the endpoint of u + v, as shown in Fig. (a) below.

[Fig. (a): the vector u + v with endpoint (a + c, b + d). Fig. (b): the vector ku with endpoint (ka, kb).]

(ii) Scalar multiplication: If (a, b) is the endpoint of the vector u, then (ka, kb) will be the endpoint of the vector ku, as shown in Fig. (b) above.
2    VECTORS IN Rⁿ AND Cⁿ    [CHAP. 1
Mathematically, we identify a vector with its endpoint; that is, we call the ordered pair (a, b) of real numbers a vector. In fact, we shall generalize this notion and call an n-tuple (a₁, a₂, ..., aₙ) of real numbers a vector. We shall again generalize and permit the coordinates of the n-tuple to be complex numbers and not just real numbers. Furthermore, in Chapter 4, we shall abstract properties of these n-tuples and formally define the mathematical system called a vector space.
We assume the reader is familiar with the elementary properties of the real number
field which we denote by R.
VECTORS IN Rⁿ
The set of all n-tuples of real numbers, denoted by Rⁿ, is called n-space. A particular n-tuple in Rⁿ, say
    u = (u₁, u₂, ..., uₙ)
is called a point or vector; the real numbers uᵢ are called the components (or: coordinates) of the vector u. Moreover, when discussing the space Rⁿ we use the term scalar for the elements of R, i.e. for the real numbers.
Example 1.1: Consider the following vectors:
    (0, 1),  (1, −3),  (1, 2, √3, 4),  (−5, ½, 0, π)
The first two vectors have two components and so are points in R²; the last two vectors have four components and so are points in R⁴.
Two vectors u and v are equal, written u = v, if they have the same number of components, i.e. belong to the same space, and if corresponding components are equal. The vectors (1, 2, 3) and (2, 3, 1) are not equal, since corresponding elements are not equal.
Example 1.2: Suppose (x − y, x + y, z − 1) = (4, 2, 3). Then, by definition of equality of vectors,
    x − y = 4
    x + y = 2
    z − 1 = 3
Solving the above system of equations gives x = 3, y = −1, and z = 4.
VECTOR ADDITION AND SCALAR MULTIPLICATION
Let u and v be vectors in Rⁿ:
    u = (u₁, u₂, ..., uₙ)   and   v = (v₁, v₂, ..., vₙ)
The sum of u and v, written u + v, is the vector obtained by adding corresponding components:
    u + v = (u₁ + v₁, u₂ + v₂, ..., uₙ + vₙ)
The product of a real number k by the vector u, written ku, is the vector obtained by multiplying each component of u by k:
    ku = (ku₁, ku₂, ..., kuₙ)
Observe that u + v and ku are also vectors in Rⁿ. We also define
    −u = −1u   and   u − v = u + (−v)
The sum of vectors with different numbers of components is not defined.
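The component-wise operations just defined can be sketched directly in code. This is a minimal illustration, not part of the text; the helper names `vec_add`, `scalar_mul` and `vec_sub` are my own.

```python
# A minimal sketch of the component-wise operations defined above,
# for vectors represented as Python tuples.

def vec_add(u, v):
    """Return u + v; the sum is undefined for different lengths."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return tuple(ui + vi for ui, vi in zip(u, v))

def scalar_mul(k, u):
    """Return ku, multiplying each component of u by the scalar k."""
    return tuple(k * ui for ui in u)

def vec_sub(u, v):
    """Return u - v = u + (-v)."""
    return vec_add(u, scalar_mul(-1, v))

# The vectors of Example 1.3: u = (1, -3, 2, 4), v = (3, 5, -1, -2)
u, v = (1, -3, 2, 4), (3, 5, -1, -2)
print(vec_add(u, v))                                # (4, 2, 1, 2)
print(scalar_mul(5, u))                             # (5, -15, 10, 20)
print(vec_sub(scalar_mul(2, u), scalar_mul(3, v)))  # (-7, -21, 7, 14)
```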
CHAP. 1] VECTORS IN K« AND C" 3
Example 1.3: Let u = (1, −3, 2, 4) and v = (3, 5, −1, −2). Then
    u + v = (1 + 3, −3 + 5, 2 − 1, 4 − 2) = (4, 2, 1, 2)
    5u = (5·1, 5·(−3), 5·2, 5·4) = (5, −15, 10, 20)
    2u − 3v = (2, −6, 4, 8) + (−9, −15, 3, 6) = (−7, −21, 7, 14)
Example 1.4: The vector (0, 0, ..., 0) in Rⁿ, denoted by 0, is called the zero vector. It is similar to the scalar 0 in that, for any vector u = (u₁, u₂, ..., uₙ),
    u + 0 = (u₁ + 0, u₂ + 0, ..., uₙ + 0) = (u₁, u₂, ..., uₙ) = u
Basic properties of the vectors in R" under the operations of vector addition and scalar
multiplication are described in the following theorem.
Theorem 1.1: For any vectors u, v, w ∈ Rⁿ and any scalars k, k′ ∈ R:
    (i) (u + v) + w = u + (v + w)    (v) k(u + v) = ku + kv
    (ii) u + 0 = u                   (vi) (k + k′)u = ku + k′u
    (iii) u + (−u) = 0               (vii) (kk′)u = k(k′u)
    (iv) u + v = v + u               (viii) 1u = u
Remark: Suppose u and v are vectors in Rⁿ for which u = kv for some nonzero scalar k ∈ R. Then u is said to be in the same direction as v if k > 0, and in the opposite direction if k < 0.
DOT PRODUCT
Let u and v be vectors in Rⁿ:
    u = (u₁, u₂, ..., uₙ)   and   v = (v₁, v₂, ..., vₙ)
The dot or inner product of u and v, denoted by u·v, is the scalar obtained by multiplying corresponding components and adding the resulting products:
    u·v = u₁v₁ + u₂v₂ + ··· + uₙvₙ
The vectors u and v are said to be orthogonal (or: perpendicular) if their dot product is zero: u·v = 0.
Example 1.5: Let u = (1, −2, 3, −4), v = (6, 7, 1, −2) and w = (5, −4, 5, 7). Then
    u·v = 1·6 + (−2)·7 + 3·1 + (−4)·(−2) = 6 − 14 + 3 + 8 = 3
    u·w = 1·5 + (−2)·(−4) + 3·5 + (−4)·7 = 5 + 8 + 15 − 28 = 0
Thus u and w are orthogonal.
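The dot product and the orthogonality test can be sketched as follows; this is an illustration of the definition above, not part of the text, and the names `dot` and `orthogonal` are my own.

```python
# A small sketch of the dot product on R^n and the orthogonality
# test u . v == 0 defined above.

def dot(u, v):
    """Return u . v = u1*v1 + u2*v2 + ... + un*vn."""
    if len(u) != len(v):
        raise ValueError("dot product needs equal numbers of components")
    return sum(ui * vi for ui, vi in zip(u, v))

def orthogonal(u, v):
    """Vectors are orthogonal (perpendicular) iff their dot product is zero."""
    return dot(u, v) == 0

# The vectors of Example 1.5
u, v, w = (1, -2, 3, -4), (6, 7, 1, -2), (5, -4, 5, 7)
print(dot(u, v))         # 3
print(orthogonal(u, w))  # True
```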
Basic properties of the dot product in R" follow.
Theorem 1.2: For any vectors u, v, w ∈ Rⁿ and any scalar k ∈ R:
    (i) (u + v)·w = u·w + v·w    (iii) u·v = v·u
    (ii) (ku)·v = k(u·v)         (iv) u·u ≥ 0, and u·u = 0 iff u = 0
Remark: The space Rⁿ with the above operations of vector addition, scalar multiplication and dot product is usually called Euclidean n-space.
NORM AND DISTANCE IN Rⁿ
Let u and v be vectors in Rⁿ: u = (u₁, u₂, ..., uₙ) and v = (v₁, v₂, ..., vₙ). The distance between the points u and v, written d(u, v), is defined by
    d(u, v) = √((u₁ − v₁)² + (u₂ − v₂)² + ··· + (uₙ − vₙ)²)
The norm (or: length) of the vector u, written ||u||, is defined to be the nonnegative square root of u·u:
    ||u|| = √(u·u) = √(u₁² + u₂² + ··· + uₙ²)
By Theorem 1.2, u·u ≥ 0 and so the square root exists. Observe that
    d(u, v) = ||u − v||
Example 1.6: Let u = (1, −2, 4, 1) and v = (3, 1, −5, 0). Then
    d(u, v) = √((1 − 3)² + (−2 − 1)² + (4 + 5)² + (1 − 0)²) = √95
    ||v|| = √(3² + 1² + (−5)² + 0²) = √35
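The norm and distance formulas above can be sketched in a few lines; this is an illustration, not part of the text, and the helper names are my own.

```python
# A sketch of the norm and distance formulas above, using math.sqrt.
import math

def norm(u):
    """||u|| = sqrt(u1^2 + ... + un^2), the nonnegative square root of u.u."""
    return math.sqrt(sum(ui * ui for ui in u))

def distance(u, v):
    """d(u, v) = ||u - v||."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

# The vectors of Example 1.6
u, v = (1, -2, 4, 1), (3, 1, -5, 0)
print(distance(u, v))  # sqrt(95), about 9.7468
print(norm(v))         # sqrt(35), about 5.9161
```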
Now if we consider two points, say p = (a, b) and q = (c, d) in the plane R², then
    ||p|| = √(a² + b²)   and   d(p, q) = √((a − c)² + (b − d)²)
That is, ||p|| corresponds to the usual Euclidean length of the arrow from the origin to the point p, and d(p, q) corresponds to the usual Euclidean distance between the points p and q, as shown below:

[Figure: the arrow from the origin to p = (a, b), and the segment joining p = (a, b) to q = (c, d).]

A similar result holds for points on the line R and in space R³.
Remark: A vector e is called a unit vector if its norm is 1: ||e|| = 1. Observe that, for any nonzero vector u ∈ Rⁿ, the vector u/||u|| is a unit vector in the same direction as u.
We now state a fundamental relationship known as the Cauchy-Schwarz inequality.
Theorem 1.3 (Cauchy-Schwarz): For any vectors u, v ∈ Rⁿ, |u·v| ≤ ||u|| ||v||.
Using the above inequality, we can now define the angle θ between any two nonzero vectors u, v ∈ Rⁿ by
    cos θ = u·v / (||u|| ||v||)
Note that if u·v = 0, then θ = 90° (or: θ = π/2). This then agrees with our previous definition of orthogonality.
COMPLEX NUMBERS
The set of complex numbers is denoted by C. Formally, a complex number is an ordered pair (a, b) of real numbers; equality, addition and multiplication of complex numbers are defined as follows:
    (a, b) = (c, d)   iff   a = c and b = d
    (a, b) + (c, d) = (a + c, b + d)
    (a, b)(c, d) = (ac − bd, ad + bc)
We identify the real number a with the complex number (a, 0):
    a ↔ (a, 0)
This is possible since the operations of addition and multiplication of real numbers are preserved under the correspondence:
    (a, 0) + (b, 0) = (a + b, 0)   and   (a, 0)(b, 0) = (ab, 0)
Thus we view R as a subset of C and replace (a, 0) by a whenever convenient and possible.
The complex number (0, 1), denoted by i, has the important property that
    i² = ii = (0, 1)(0, 1) = (−1, 0) = −1   or   i = √−1
Furthermore, using the fact
    (a, b) = (a, 0) + (0, b)   and   (0, b) = (b, 0)(0, 1)
we have
    (a, b) = (a, 0) + (b, 0)(0, 1) = a + bi
The notation a + bi is more convenient than (a, b). For example, the sum and product of complex numbers can be obtained by simply using the commutative and distributive laws and i² = −1:
    (a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i
    (a + bi)(c + di) = ac + bci + adi + bdi² = (ac − bd) + (bc + ad)i
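Python's built-in `complex` type obeys exactly these rules, so the ordered-pair definitions above can be checked against it. This is a sketch, not part of the text; the pair helpers are my own.

```python
# The ordered-pair definitions above, checked against Python's complex type.

def pair_add(p, q):
    """(a, b) + (c, d) = (a + c, b + d)"""
    a, b = p
    c, d = q
    return (a + c, b + d)

def pair_mul(p, q):
    """(a, b)(c, d) = (ac - bd, ad + bc)"""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

z, w = (2, 3), (5, -2)                 # 2 + 3i and 5 - 2i as ordered pairs
print(pair_add(z, w))                  # (7, 1)   i.e. 7 + i
print(pair_mul(z, w))                  # (16, 11) i.e. 16 + 11i
print(complex(2, 3) * complex(5, -2))  # (16+11j): Python agrees
```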
The conjugate of the complex number z = (a, b) = a + bi is denoted and defined by
    z̄ = a − bi
(Notice that zz̄ = a² + b².) If, in addition, z ≠ 0, then the inverse z⁻¹ of z and division by z are given by
    z⁻¹ = z̄/(zz̄) = a/(a² + b²) − (b/(a² + b²))i   and   w/z = wz̄/(zz̄)
where w ∈ C. We also define
    −z = −1z   and   w − z = w + (−z)
Example 1.7: Suppose z = 2 + 3i and w = 5 − 2i. Then
    z + w = (2 + 3i) + (5 − 2i) = 2 + 5 + 3i − 2i = 7 + i
    zw = (2 + 3i)(5 − 2i) = 10 + 15i − 4i − 6i² = 16 + 11i
    z̄ = (2 + 3i)‾ = 2 − 3i   and   w̄ = (5 − 2i)‾ = 5 + 2i
    w/z = (5 − 2i)/(2 + 3i) = (5 − 2i)(2 − 3i)/((2 + 3i)(2 − 3i)) = (4 − 19i)/13 = 4/13 − (19/13)i
Just as the real numbers can be represented by the
points on a line, the complex numbers can be represented
by the points in the plane. Specifically, we let the point
(a, b) in the plane represent the complex number z = a + bi,
i.e. whose real part is a and whose imaginary part is b. The
absolute value of z, written |z|, is defined as the distance from z to the origin:
    |z| = √(a² + b²)
Note that |z| is equal to the norm of the vector (a, b). Also, |z|² = zz̄.
Example 1.8: Suppose z = 2 + 3i and w = 12 − 5i. Then
    |z| = √(4 + 9) = √13   and   |w| = √(144 + 25) = 13
Remark: In Appendix B we define the algebraic structure called a field. We emphasize
that the set C of complex numbers with the above operations of addition and
multiplication is a field.
VECTORS IN Cⁿ
The set of all n-tuples of complex numbers, denoted by Cⁿ, is called complex n-space. Just as in the real case, the elements of Cⁿ are called points or vectors, the elements of C are called scalars, and vector addition in Cⁿ and scalar multiplication on Cⁿ are given by
    (z₁, z₂, ..., zₙ) + (w₁, w₂, ..., wₙ) = (z₁ + w₁, z₂ + w₂, ..., zₙ + wₙ)
    z(z₁, z₂, ..., zₙ) = (zz₁, zz₂, ..., zzₙ)
where zᵢ, wᵢ, z ∈ C.
Example 1.9: (2 + 3i, 4 − i, 3) + (3 − 2i, 5i, 4 − 6i) = (5 + i, 4 + 4i, 7 − 6i)
    2i(2 + 3i, 4 − i, 3) = (−6 + 4i, 2 + 8i, 6i)
Now let u and v be arbitrary vectors in Cⁿ:
    u = (z₁, z₂, ..., zₙ),   v = (w₁, w₂, ..., wₙ),   zᵢ, wᵢ ∈ C
The dot, or inner, product of u and v is defined as follows:
    u·v = z₁w̄₁ + z₂w̄₂ + ··· + zₙw̄ₙ
Note that this definition reduces to the previous one in the real case, since w̄ᵢ = wᵢ when wᵢ is real. The norm of u is defined by
    ||u|| = √(u·u) = √(z₁z̄₁ + z₂z̄₂ + ··· + zₙz̄ₙ) = √(|z₁|² + |z₂|² + ··· + |zₙ|²)
Observe that u·u and so ||u|| are real and positive when u ≠ 0, and 0 when u = 0.
Example 1.10: Let u = (2 + 3i, 4 − i, 2i) and v = (3 − 2i, 5, 4 − 6i). Then
    u·v = (2 + 3i)(3 − 2i)‾ + (4 − i)(5)‾ + (2i)(4 − 6i)‾
        = (2 + 3i)(3 + 2i) + (4 − i)(5) + (2i)(4 + 6i)
        = 13i + 20 − 5i − 12 + 8i = 8 + 16i
    u·u = (2 + 3i)(2 + 3i)‾ + (4 − i)(4 − i)‾ + (2i)(2i)‾
        = (2 + 3i)(2 − 3i) + (4 − i)(4 + i) + (2i)(−2i)
        = 13 + 17 + 4 = 34
    ||u|| = √(u·u) = √34
The space Cⁿ with the above operations of vector addition, scalar multiplication and dot product is called complex Euclidean n-space.
Remark: If u·v were defined by u·v = z₁w₁ + ··· + zₙwₙ, then it would be possible for u·u = 0 even though u ≠ 0, e.g. if u = (1, i, 0). In fact, u·u might not even be real.
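The Cⁿ dot product, with its conjugation of the second vector's components, can be sketched using Python's complex numbers. This is an illustration, not part of the text; `cdot` and `cnorm` are my own names.

```python
# A sketch of the C^n dot product, which conjugates the components of the
# SECOND vector, and the norm ||u|| = sqrt(u.u).
import math

def cdot(u, v):
    """u.v = z1*conj(w1) + ... + zn*conj(wn) for complex components."""
    return sum(z * w.conjugate() for z, w in zip(u, v))

def cnorm(u):
    """||u|| = sqrt(u.u); u.u is real and nonnegative."""
    return math.sqrt(cdot(u, u).real)

# The vectors of Example 1.10
u = (2 + 3j, 4 - 1j, 2j)
v = (3 - 2j, 5 + 0j, 4 - 6j)
print(cdot(u, v))   # (8+16j)
print(cnorm(u))     # sqrt(34), about 5.8310
```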
Solved Problems
VECTORS IN Rⁿ
1.1. Compute: (i) (3, −4, 5) + (1, 1, −2); (ii) (1, 2, −3) + (4, −5); (iii) −3(4, −5, −6); (iv) −(−6, 7, −8).
(i) Add corresponding components: (3, −4, 5) + (1, 1, −2) = (3 + 1, −4 + 1, 5 − 2) = (4, −3, 3).
(ii) The sum is not defined since the vectors have different numbers of components.
(iii) Multiply each component by the scalar: −3(4, −5, −6) = (−12, 15, 18).
(iv) Multiply each component by −1: −(−6, 7, −8) = (6, −7, 8).
1.2. Let u = (2, −7, 1), v = (−3, 0, 4), w = (0, 5, −8). Find (i) 3u − 4v, (ii) 2u + 3v − 5w.
First perform the scalar multiplication and then the vector addition.
(i) 3u − 4v = 3(2, −7, 1) − 4(−3, 0, 4) = (6, −21, 3) + (12, 0, −16) = (18, −21, −13)
(ii) 2u + 3v − 5w = 2(2, −7, 1) + 3(−3, 0, 4) − 5(0, 5, −8)
        = (4, −14, 2) + (−9, 0, 12) + (0, −25, 40)
        = (4 − 9 + 0, −14 + 0 − 25, 2 + 12 + 40) = (−5, −39, 54)
1.3. Find x and y if (x, 3) = (2, x + y).
Since the two vectors are equal, the corresponding components are equal to each other:
    x = 2,   3 = x + y
Substitute x = 2 into the second equation to obtain y = 1. Thus x = 2 and y = 1.
1.4. Find x and y if (4, y) = x(2, 3).
Multiply by the scalar x to obtain (4, y) = x(2, 3) = (2x, 3x).
Set the corresponding components equal to each other: 4 = 2x, y = 3x.
Solve the linear equations for x and y: x = 2 and y = 6.
1.5. Find x, y and z if (2, −3, 4) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0).
First multiply by the scalars x, y and z and then add:
    (2, −3, 4) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)
              = (x, x, x) + (y, y, 0) + (z, 0, 0)
              = (x + y + z, x + y, x)
Now set the corresponding components equal to each other:
    x + y + z = 2,   x + y = −3,   x = 4
To solve the system of equations, substitute x = 4 into the second equation to obtain 4 + y = −3 or y = −7. Then substitute into the first equation to find z = 5. Thus x = 4, y = −7, z = 5.
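The system in Problem 1.5 is triangular, so the back-substitution used there can be sketched in three lines (an illustration, not part of the text):

```python
# Problem 1.5 reduces to the triangular system
#   x + y + z = 2,  x + y = -3,  x = 4
# which back-substitution solves directly.
x = 4            # third equation
y = -3 - x       # second equation: x + y = -3
z = 2 - x - y    # first equation: x + y + z = 2
print(x, y, z)   # 4 -7 5
```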
1.6. Prove Theorem 1.1: For any vectors u, v, w ∈ Rⁿ and any scalars k, k′ ∈ R,
    (i) (u + v) + w = u + (v + w)    (v) k(u + v) = ku + kv
    (ii) u + 0 = u                   (vi) (k + k′)u = ku + k′u
    (iii) u + (−u) = 0               (vii) (kk′)u = k(k′u)
    (iv) u + v = v + u               (viii) 1u = u
Let uᵢ, vᵢ and wᵢ be the ith components of u, v and w, respectively.
(i) By definition, uᵢ + vᵢ is the ith component of u + v and so (uᵢ + vᵢ) + wᵢ is the ith component of (u + v) + w. On the other hand, vᵢ + wᵢ is the ith component of v + w and so uᵢ + (vᵢ + wᵢ) is the ith component of u + (v + w). But uᵢ, vᵢ and wᵢ are real numbers for which the associative law holds, that is,
    (uᵢ + vᵢ) + wᵢ = uᵢ + (vᵢ + wᵢ)   for i = 1, ..., n
Accordingly, (u + v) + w = u + (v + w) since their corresponding components are equal.
(ii) Here 0 = (0, 0, ..., 0); hence
    u + 0 = (u₁, u₂, ..., uₙ) + (0, 0, ..., 0)
          = (u₁ + 0, u₂ + 0, ..., uₙ + 0) = (u₁, u₂, ..., uₙ) = u
(iii) Since −u = −1(u₁, u₂, ..., uₙ) = (−u₁, −u₂, ..., −uₙ),
    u + (−u) = (u₁, u₂, ..., uₙ) + (−u₁, −u₂, ..., −uₙ)
             = (u₁ − u₁, u₂ − u₂, ..., uₙ − uₙ) = (0, 0, ..., 0) = 0
(iv) By definition, uᵢ + vᵢ is the ith component of u + v, and vᵢ + uᵢ is the ith component of v + u. But uᵢ and vᵢ are real numbers for which the commutative law holds, that is,
    uᵢ + vᵢ = vᵢ + uᵢ,   i = 1, ..., n
Hence u + v = v + u since their corresponding components are equal.
(v) Since uᵢ + vᵢ is the ith component of u + v, k(uᵢ + vᵢ) is the ith component of k(u + v). Since kuᵢ and kvᵢ are the ith components of ku and kv respectively, kuᵢ + kvᵢ is the ith component of ku + kv. But k, uᵢ and vᵢ are real numbers; hence
    k(uᵢ + vᵢ) = kuᵢ + kvᵢ,   i = 1, ..., n
Thus k(u + v) = ku + kv, as corresponding components are equal.
(vi) Observe that the first plus sign refers to the addition of the two scalars k and k′ whereas the second plus sign refers to the vector addition of the two vectors ku and k′u.
By definition, (k + k′)uᵢ is the ith component of the vector (k + k′)u. Since kuᵢ and k′uᵢ are the ith components of ku and k′u respectively, kuᵢ + k′uᵢ is the ith component of ku + k′u. But k, k′ and uᵢ are real numbers; hence
    (k + k′)uᵢ = kuᵢ + k′uᵢ,   i = 1, ..., n
Thus (k + k′)u = ku + k′u, as corresponding components are equal.
(vii) Since k′uᵢ is the ith component of k′u, k(k′uᵢ) is the ith component of k(k′u). But (kk′)uᵢ is the ith component of (kk′)u and, since k, k′ and uᵢ are real numbers,
    (kk′)uᵢ = k(k′uᵢ),   i = 1, ..., n
Hence (kk′)u = k(k′u), as corresponding components are equal.
(viii) 1·u = 1(u₁, u₂, ..., uₙ) = (1u₁, 1u₂, ..., 1uₙ) = (u₁, u₂, ..., uₙ) = u.
1.7. Show that 0u = 0 for any vector u, where clearly the first 0 is a scalar and the second a vector.
Method 1: 0u = 0(u₁, u₂, ..., uₙ) = (0u₁, 0u₂, ..., 0uₙ) = (0, 0, ..., 0) = 0
Method 2: By Theorem 1.1, 0u = (0 + 0)u = 0u + 0u.
Adding −0u to both sides gives us the required result.
DOT PRODUCT
1.8. Compute u·v where: (i) u = (2, −3, 6), v = (8, 2, −3); (ii) u = (1, −8, 0, 5), v = (3, 6, 4); (iii) u = (3, −5, 2, 1), v = (4, 1, −2, 5).
(i) Multiply corresponding components and add: u·v = 2·8 + (−3)·2 + 6·(−3) = −8.
(ii) The dot product is not defined between vectors with different numbers of components.
(iii) Multiply corresponding components and add: u·v = 3·4 + (−5)·1 + 2·(−2) + 1·5 = 8.
1.9. Determine k so that the vectors u and v are orthogonal where
(i) u = (1, k, −3) and v = (2, −5, 4)
(ii) u = (2, 3k, −4, 1, 5) and v = (6, −1, 3, 7, 2k)
In each case, compute u·v, set it equal to 0, and solve for k.
(i) u·v = 1·2 + k·(−5) + (−3)·4 = 2 − 5k − 12 = 0,  −5k − 10 = 0,  k = −2
(ii) u·v = 2·6 + 3k·(−1) + (−4)·3 + 1·7 + 5·2k = 12 − 3k − 12 + 7 + 10k = 7k + 7 = 0,  k = −1
1.10. Prove Theorem 1.2: For any vectors u, v, w ∈ Rⁿ and any scalar k ∈ R,
    (i) (u + v)·w = u·w + v·w    (iii) u·v = v·u
    (ii) (ku)·v = k(u·v)         (iv) u·u ≥ 0, and u·u = 0 iff u = 0
Let u = (u₁, u₂, ..., uₙ), v = (v₁, v₂, ..., vₙ), w = (w₁, w₂, ..., wₙ).
(i) Since u + v = (u₁ + v₁, u₂ + v₂, ..., uₙ + vₙ),
    (u + v)·w = (u₁ + v₁)w₁ + (u₂ + v₂)w₂ + ··· + (uₙ + vₙ)wₙ
              = u₁w₁ + v₁w₁ + u₂w₂ + v₂w₂ + ··· + uₙwₙ + vₙwₙ
              = (u₁w₁ + u₂w₂ + ··· + uₙwₙ) + (v₁w₁ + v₂w₂ + ··· + vₙwₙ)
              = u·w + v·w
(ii) Since ku = (ku₁, ku₂, ..., kuₙ),
    (ku)·v = ku₁v₁ + ku₂v₂ + ··· + kuₙvₙ = k(u₁v₁ + u₂v₂ + ··· + uₙvₙ) = k(u·v)
(iii) u·v = u₁v₁ + u₂v₂ + ··· + uₙvₙ = v₁u₁ + v₂u₂ + ··· + vₙuₙ = v·u
(iv) Since uᵢ² is nonnegative for each i, and since the sum of nonnegative real numbers is nonnegative,
    u·u = u₁² + u₂² + ··· + uₙ² ≥ 0
Furthermore, u·u = 0 iff uᵢ = 0 for each i, that is, iff u = 0.
DISTANCE AND NORM IN Rⁿ
1.11. Find the distance d(u, v) between the vectors u and v where: (i) u = (1, 7), v = (6, −5); (ii) u = (3, −5, 4), v = (6, 2, −1); (iii) u = (5, 3, −2, −4, −1), v = (2, −1, 0, −7, 2).
In each case use the formula d(u, v) = √((u₁ − v₁)² + ··· + (uₙ − vₙ)²).
(i) d(u, v) = √((1 − 6)² + (7 + 5)²) = √(25 + 144) = √169 = 13
(ii) d(u, v) = √((3 − 6)² + (−5 − 2)² + (4 + 1)²) = √(9 + 49 + 25) = √83
(iii) d(u, v) = √((5 − 2)² + (3 + 1)² + (−2 − 0)² + (−4 + 7)² + (−1 − 2)²) = √47
1.12. Find k such that d(u, v) = 6 where u = (2, k, 1, −4) and v = (3, −1, 6, −3).
    (d(u, v))² = (2 − 3)² + (k + 1)² + (1 − 6)² + (−4 + 3)² = k² + 2k + 28
Now solve k² + 2k + 28 = 6² to obtain k = 2, −4.
1.13. Find the norm ||u|| of the vector u if (i) u = (2, −7), (ii) u = (3, −12, −4).
In each case use the formula ||u|| = √(u₁² + u₂² + ··· + uₙ²).
(i) ||u|| = √(2² + (−7)²) = √(4 + 49) = √53
(ii) ||u|| = √(3² + (−12)² + (−4)²) = √(9 + 144 + 16) = √169 = 13
1.14. Determine k such that ||u|| = √39 where u = (1, k, −2, 5).
    ||u||² = 1² + k² + (−2)² + 5² = k² + 30
Now solve k² + 30 = 39 and obtain k = 3, −3.
1.15. Show that ||u|| ≥ 0, and ||u|| = 0 iff u = 0.
By Theorem 1.2, u·u ≥ 0, and u·u = 0 iff u = 0. Since ||u|| = √(u·u), the result follows.
1.16. Prove Theorem 1.3 (Cauchy-Schwarz): For any vectors u = (u₁, ..., uₙ) and v = (v₁, ..., vₙ) in Rⁿ, |u·v| ≤ ||u|| ||v||.
We shall prove the following stronger statement: |u·v| ≤ Σᵢ |uᵢvᵢ| ≤ ||u|| ||v||.
If u = 0 or v = 0, then the inequality reduces to 0 ≤ 0 ≤ 0 and is therefore true. Hence we need only consider the case in which u ≠ 0 and v ≠ 0, i.e. where ||u|| ≠ 0 and ||v|| ≠ 0.
Furthermore,
    |u·v| = |u₁v₁ + ··· + uₙvₙ| ≤ |u₁v₁| + ··· + |uₙvₙ| = Σᵢ |uᵢvᵢ|
Thus we need only prove the second inequality.
Now for any real numbers x, y ∈ R, 0 ≤ (x − y)² = x² − 2xy + y² or, equivalently,
    2xy ≤ x² + y²   (1)
Set x = |uᵢ|/||u|| and y = |vᵢ|/||v|| in (1) to obtain, for any i,
    2 |uᵢ| |vᵢ| / (||u|| ||v||) ≤ |uᵢ|²/||u||² + |vᵢ|²/||v||²   (2)
But, by definition of the norm of a vector, ||u||² = Σᵢ uᵢ² = Σᵢ |uᵢ|² and ||v||² = Σᵢ vᵢ² = Σᵢ |vᵢ|². Thus summing (2) with respect to i and using |uᵢvᵢ| = |uᵢ| |vᵢ|, we have
    2 Σᵢ |uᵢvᵢ| / (||u|| ||v||) ≤ ||u||²/||u||² + ||v||²/||v||² = 1 + 1 = 2
that is,
    Σᵢ |uᵢvᵢ| / (||u|| ||v||) ≤ 1
Multiplying both sides by ||u|| ||v||, we obtain the required inequality.
1.17. Prove Minkowski's inequality: For any vectors u = (u₁, ..., uₙ) and v = (v₁, ..., vₙ) in Rⁿ, ||u + v|| ≤ ||u|| + ||v||.
If ||u + v|| = 0, the inequality clearly holds. Thus we need only consider the case ||u + v|| ≠ 0.
Now |uᵢ + vᵢ| ≤ |uᵢ| + |vᵢ| for any real numbers uᵢ, vᵢ ∈ R. Hence
    ||u + v||² = Σᵢ (uᵢ + vᵢ)² = Σᵢ |uᵢ + vᵢ|²
               = Σᵢ |uᵢ + vᵢ| |uᵢ + vᵢ| ≤ Σᵢ |uᵢ + vᵢ| (|uᵢ| + |vᵢ|)
               = Σᵢ |uᵢ + vᵢ| |uᵢ| + Σᵢ |uᵢ + vᵢ| |vᵢ|
But by the Cauchy-Schwarz inequality (see preceding problem),
    Σᵢ |uᵢ + vᵢ| |uᵢ| ≤ ||u + v|| ||u||   and   Σᵢ |uᵢ + vᵢ| |vᵢ| ≤ ||u + v|| ||v||
Thus
    ||u + v||² ≤ ||u + v|| ||u|| + ||u + v|| ||v|| = ||u + v|| (||u|| + ||v||)
Dividing by ||u + v||, we obtain the required inequality.
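The Cauchy-Schwarz and Minkowski inequalities can be spot-checked numerically on random vectors; this sketch is my own addition, not part of the text, and of course a numeric check is no substitute for the proofs above.

```python
# A numeric spot-check of |u.v| <= ||u|| ||v|| (Cauchy-Schwarz)
# and ||u + v|| <= ||u|| + ||v|| (Minkowski) on random vectors.
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

random.seed(0)
for _ in range(1000):
    n = random.randint(1, 6)
    u = [random.uniform(-10, 10) for _ in range(n)]
    v = [random.uniform(-10, 10) for _ in range(n)]
    s = [a + b for a, b in zip(u, v)]
    assert abs(dot(u, v)) <= norm(u) * norm(v) + 1e-9   # Cauchy-Schwarz
    assert norm(s) <= norm(u) + norm(v) + 1e-9          # Minkowski
print("both inequalities hold on 1000 random trials")
```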
1.18. Prove that the norm in Rⁿ satisfies the following laws:
[N₁]: For any vector u, ||u|| ≥ 0; and ||u|| = 0 iff u = 0.
[N₂]: For any vector u and any scalar k, ||ku|| = |k| ||u||.
[N₃]: For any vectors u and v, ||u + v|| ≤ ||u|| + ||v||.
[N₁] was proved in Problem 1.15, and [N₃] in Problem 1.17. Hence we need only prove that [N₂] holds.
Suppose u = (u₁, u₂, ..., uₙ) and so ku = (ku₁, ku₂, ..., kuₙ). Then
    ||ku||² = (ku₁)² + (ku₂)² + ··· + (kuₙ)² = k²u₁² + k²u₂² + ··· + k²uₙ² = k² ||u||²
The square root of both sides of the equality gives us the required result.
COMPLEX NUMBERS
1.19. Simplify: (i) (5 + 3i)(2 − 7i); (ii) (4 − 3i)²; (iii) 1/(3 − 4i); (iv) (2 − 7i)/(5 + 3i); (v) i³, i⁴, i³¹; (vi) (1 + 2i)³; (vii) (1/(2 − 3i))².
(i) (5 + 3i)(2 − 7i) = 10 + 6i − 35i − 21i² = 31 − 29i
(ii) (4 − 3i)² = 16 − 24i + 9i² = 7 − 24i
(iii) 1/(3 − 4i) = (3 + 4i)/((3 − 4i)(3 + 4i)) = (3 + 4i)/25 = 3/25 + (4/25)i
(iv) (2 − 7i)/(5 + 3i) = (2 − 7i)(5 − 3i)/((5 + 3i)(5 − 3i)) = (−11 − 41i)/34 = −11/34 − (41/34)i
(v) i³ = i²·i = (−1)i = −i;   i⁴ = i²·i² = (−1)(−1) = 1;   i³¹ = (i⁴)⁷·i³ = 1⁷·(−i) = −i
(vi) (1 + 2i)³ = 1 + 6i + 12i² + 8i³ = 1 + 6i − 12 − 8i = −11 − 2i
(vii) (1/(2 − 3i))² = 1/(−5 − 12i) = (−5 + 12i)/((−5 − 12i)(−5 + 12i)) = (−5 + 12i)/169 = −5/169 + (12/169)i
1.20. Let z = 2 − 3i and w = 4 + 5i. Find:
(i) z + w and zw; (ii) z/w; (iii) z̄ and w̄; (iv) |z| and |w|.
(i) z + w = 2 − 3i + 4 + 5i = 6 + 2i
    zw = (2 − 3i)(4 + 5i) = 8 − 12i + 10i − 15i² = 23 − 2i
(ii) z/w = (2 − 3i)/(4 + 5i) = (2 − 3i)(4 − 5i)/((4 + 5i)(4 − 5i)) = (−7 − 22i)/41 = −7/41 − (22/41)i
(iii) Use (a + bi)‾ = a − bi:  z̄ = (2 − 3i)‾ = 2 + 3i;  w̄ = (4 + 5i)‾ = 4 − 5i.
(iv) Use |a + bi| = √(a² + b²):  |z| = |2 − 3i| = √(4 + 9) = √13;  |w| = |4 + 5i| = √(16 + 25) = √41.
1.21. Prove: For any complex numbers z, w ∈ C,
(i) (z + w)‾ = z̄ + w̄, (ii) (zw)‾ = z̄ w̄, (iii) (z̄)‾ = z.
Suppose z = a + bi and w = c + di where a, b, c, d ∈ R.
(i) (z + w)‾ = ((a + bi) + (c + di))‾ = ((a + c) + (b + d)i)‾
            = (a + c) − (b + d)i = a + c − bi − di
            = (a − bi) + (c − di) = z̄ + w̄
(ii) (zw)‾ = ((a + bi)(c + di))‾ = ((ac − bd) + (ad + bc)i)‾
           = (ac − bd) − (ad + bc)i = (a − bi)(c − di) = z̄ w̄
(iii) (z̄)‾ = (a − bi)‾ = a − (−b)i = a + bi = z
1.22. Prove: For any complex numbers z, w ∈ C, |zw| = |z| |w|.
Suppose z = a + bi and w = c + di where a, b, c, d ∈ R. Then
    |z|² = a² + b²,   |w|² = c² + d²,   and   zw = (ac − bd) + (ad + bc)i
Thus
    |zw|² = (ac − bd)² + (ad + bc)²
          = a²c² − 2abcd + b²d² + a²d² + 2abcd + b²c²
          = a²(c² + d²) + b²(c² + d²) = (a² + b²)(c² + d²) = |z|² |w|²
The square root of both sides gives us the desired result.
1.23. Prove: For any complex numbers z, w ∈ C, |z + w| ≤ |z| + |w|.
Suppose z = a + bi and w = c + di where a, b, c, d ∈ R. Consider the vectors u = (a, b) and v = (c, d) in R². Note that
    |z| = √(a² + b²) = ||u||,   |w| = √(c² + d²) = ||v||
and
    |z + w| = |(a + c) + (b + d)i| = √((a + c)² + (b + d)²) = ||(a + c, b + d)|| = ||u + v||
By Minkowski's inequality (Problem 1.17), ||u + v|| ≤ ||u|| + ||v|| and so
    |z + w| = ||u + v|| ≤ ||u|| + ||v|| = |z| + |w|
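Python's built-in `abs()` on complex numbers is exactly the absolute value defined in the text, so the two identities just proved can be spot-checked (a sketch of my own, not part of the text):

```python
# Spot-checking |zw| = |z||w| and |z + w| <= |z| + |w| with Python's
# built-in complex absolute value.
import math

z, w = 2 - 3j, 4 + 5j
assert math.isclose(abs(z * w), abs(z) * abs(w))   # |zw| = |z||w|
assert abs(z + w) <= abs(z) + abs(w)               # |z+w| <= |z|+|w|
print(abs(z), abs(w))  # sqrt(13) and sqrt(41)
```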
VECTORS IN Cⁿ
1.24. Let u = (3 − 2i, 4i, 1 + 6i) and v = (5 + i, 2 − 3i, 5). Find:
(i) u + v, (ii) 4iu, (iii) (1 + i)v, (iv) (1 − 2i)u + (3 + i)v.
(i) Add corresponding components: u + v = (8 − i, 2 + i, 6 + 6i).
(ii) Multiply each component of u by the scalar 4i: 4iu = (8 + 12i, −16, −24 + 4i).
(iii) Multiply each component of v by the scalar 1 + i:
    (1 + i)v = (5 + 6i + i², 2 − i − 3i², 5 + 5i) = (4 + 6i, 5 − i, 5 + 5i)
(iv) First perform the scalar multiplication and then the vector addition:
    (1 − 2i)u + (3 + i)v = (−1 − 8i, 8 + 4i, 13 + 4i) + (14 + 8i, 9 − 7i, 15 + 5i)
                        = (13, 17 − 3i, 28 + 9i)
1.25. Find u·v and v·u where: (i) u = (1 − 2i, 3 + i), v = (4 + 2i, 5 − 6i); (ii) u = (3 − 2i, 4i, 1 + 6i), v = (5 + i, 2 − 3i, 7 + 2i).
Recall that the conjugates of the second vector appear in the dot product:
    (z₁, ..., zₙ)·(w₁, ..., wₙ) = z₁w̄₁ + ··· + zₙw̄ₙ
(i) u·v = (1 − 2i)(4 + 2i)‾ + (3 + i)(5 − 6i)‾
        = (1 − 2i)(4 − 2i) + (3 + i)(5 + 6i) = −10i + 9 + 23i = 9 + 13i
    v·u = (4 + 2i)(1 − 2i)‾ + (5 − 6i)(3 + i)‾
        = (4 + 2i)(1 + 2i) + (5 − 6i)(3 − i) = 10i + 9 − 23i = 9 − 13i
(ii) u·v = (3 − 2i)(5 + i)‾ + (4i)(2 − 3i)‾ + (1 + 6i)(7 + 2i)‾
         = (3 − 2i)(5 − i) + (4i)(2 + 3i) + (1 + 6i)(7 − 2i) = 20 + 35i
    v·u = (5 + i)(3 − 2i)‾ + (2 − 3i)(4i)‾ + (7 + 2i)(1 + 6i)‾
        = (5 + i)(3 + 2i) + (2 − 3i)(−4i) + (7 + 2i)(1 − 6i) = 20 − 35i
In both examples, v·u = (u·v)‾. This holds true in general, as seen in Problem 1.27.
1.26. Find ||u|| where: (i) u = (3 + 4i, 5 − 2i, 1 − 3i); (ii) u = (4 − i, 2i, 3 + 2i, 1 − 5i).

Recall that zz̄ = a² + b² when z = a + bi. Use

        ||u||² = u·u = z₁z̄₁ + z₂z̄₂ + ··· + zₙz̄ₙ   where u = (z₁, z₂, ..., zₙ)

(i)  ||u||² = 3² + 4² + 5² + (−2)² + 1² + (−3)² = 64, or ||u|| = 8
(ii) ||u||² = 4² + (−1)² + 2² + 3² + 2² + 1² + (−5)² = 60, or ||u|| = √60 = 2√15
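In code, the same recipe reads as follows (a sketch added here for illustration; it reproduces both norms):

```python
import math

def norm(u):
    # ||u||^2 = z1*conj(z1) + ... + zn*conj(zn); each term is real and >= 0.
    return math.sqrt(sum((z * z.conjugate()).real for z in u))

print(norm([3 + 4j, 5 - 2j, 1 - 3j]))                   # 8.0
print(round(norm([4 - 1j, 2j, 3 + 2j, 1 - 5j])**2, 6))  # 60.0, so the norm is 2*sqrt(15)
```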
1.27. Prove: For any vectors u, v ∈ Cⁿ and any scalar z ∈ C, (i) u·v = conj(v·u), (ii) (zu)·v =
z(u·v), (iii) u·(zv) = z̄(u·v). (Compare with Theorem 1.2.)

Suppose u = (z₁, z₂, ..., zₙ) and v = (w₁, w₂, ..., wₙ).

(i) Using the properties of the conjugate established in Problem 1.21,

        v·u = w₁z̄₁ + w₂z̄₂ + ··· + wₙz̄ₙ = conj(w̄₁z₁) + conj(w̄₂z₂) + ··· + conj(w̄ₙzₙ)
            = conj(z₁w̄₁ + z₂w̄₂ + ··· + zₙw̄ₙ) = conj(u·v)

(ii) Since zu = (zz₁, zz₂, ..., zzₙ),

        (zu)·v = zz₁w̄₁ + zz₂w̄₂ + ··· + zzₙw̄ₙ = z(z₁w̄₁ + z₂w̄₂ + ··· + zₙw̄ₙ) = z(u·v)

(iii) Method 1. Since zv = (zw₁, zw₂, ..., zwₙ),

        u·(zv) = z₁·conj(zw₁) + z₂·conj(zw₂) + ··· + zₙ·conj(zwₙ)
               = z₁z̄w̄₁ + z₂z̄w̄₂ + ··· + zₙz̄w̄ₙ = z̄(z₁w̄₁ + z₂w̄₂ + ··· + zₙw̄ₙ) = z̄(u·v)

      Method 2. Using (i) and (ii),

        u·(zv) = conj((zv)·u) = conj(z(v·u)) = z̄·conj(v·u) = z̄(u·v)
MISCELLANEOUS PROBLEMS

1.28. Let u = (3, −2, 1, 4) and v = (7, 1, −3, 6). Find:
(i) u + v; (ii) 4u; (iii) 2u − 3v; (iv) u·v; (v) ||u|| and ||v||; (vi) d(u, v).

(i)   u + v = (3 + 7, −2 + 1, 1 − 3, 4 + 6) = (10, −1, −2, 10)
(ii)  4u = (4·3, 4·(−2), 4·1, 4·4) = (12, −8, 4, 16)
(iii) 2u − 3v = (6, −4, 2, 8) + (−21, −3, 9, −18) = (−15, −7, 11, −10)
(iv)  u·v = 21 − 2 − 3 + 24 = 40
(v)   ||u|| = √(9 + 4 + 1 + 16) = √30,   ||v|| = √(49 + 1 + 9 + 36) = √95
(vi)  d(u, v) = √((3 − 7)² + (−2 − 1)² + (1 + 3)² + (4 − 6)²) = √45 = 3√5
1.29. Let u = (7 − 2i, 2 + 5i) and v = (1 + i, −3 − 6i). Find:
(i) u + v; (ii) 2iu; (iii) (3 − i)v; (iv) u·v; (v) ||u|| and ||v||.

(i)   u + v = (7 − 2i + 1 + i, 2 + 5i − 3 − 6i) = (8 − i, −1 − i)
(ii)  2iu = (14i − 4i², 4i + 10i²) = (4 + 14i, −10 + 4i)
(iii) (3 − i)v = (3 + 3i − i − i², −9 − 18i + 3i + 6i²) = (4 + 2i, −15 − 15i)
(iv)  u·v = (7 − 2i)·conj(1 + i) + (2 + 5i)·conj(−3 − 6i)
          = (7 − 2i)(1 − i) + (2 + 5i)(−3 + 6i) = 5 − 9i − 36 − 3i = −31 − 12i
(v)   ||u|| = √(7² + (−2)² + 2² + 5²) = √82,   ||v|| = √(1² + 1² + (−3)² + (−6)²) = √47
1.30. Any pair of points P = (aᵢ) and Q = (bᵢ) in Rⁿ defines the directed line segment
from P to Q, written PQ. We identify PQ with the vector v = Q − P:

        PQ = v = (b₁ − a₁, b₂ − a₂, ..., bₙ − aₙ)

Find the vector v identified with PQ where:
(i)  P = (2, 5), Q = (−3, 4)
(ii) P = (1, −2, 4), Q = (6, 0, −3)

(i)  v = Q − P = (−3 − 2, 4 − 5) = (−5, −1)
(ii) v = Q − P = (6 − 1, 0 + 2, −3 − 4) = (5, 2, −7)
1.31. The set H of elements in Rⁿ which are solutions of a linear equation in n unknowns
x₁, ..., xₙ of the form

        c₁x₁ + c₂x₂ + ··· + cₙxₙ = b    (*)

with u = (c₁, ..., cₙ) ≠ 0 in Rⁿ, is called a hyperplane of Rⁿ, and (*) is called an equa-
tion of H. (We frequently identify H with (*).) Show that the directed line segment
PQ of any pair of points P, Q ∈ H is orthogonal to the coefficient vector u; the vector
u is said to be normal to the hyperplane H.

Suppose P = (a₁, ..., aₙ) and Q = (b₁, ..., bₙ). Then the aᵢ and the bᵢ are solutions of the
given equation:

        c₁a₁ + c₂a₂ + ··· + cₙaₙ = b,    c₁b₁ + c₂b₂ + ··· + cₙbₙ = b

Let     v = PQ = Q − P = (b₁ − a₁, b₂ − a₂, ..., bₙ − aₙ)

Then    u·v = c₁(b₁ − a₁) + c₂(b₂ − a₂) + ··· + cₙ(bₙ − aₙ)
            = c₁b₁ − c₁a₁ + c₂b₂ − c₂a₂ + ··· + cₙbₙ − cₙaₙ
            = (c₁b₁ + c₂b₂ + ··· + cₙbₙ) − (c₁a₁ + c₂a₂ + ··· + cₙaₙ) = b − b = 0

Hence v, that is, PQ, is orthogonal to u.
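A quick numerical check of this fact (a sketch added here for illustration; the two points are chosen to lie on the hyperplane 2x + 5y − 6z − 2w = −2 of Problem 1.32(i)):

```python
u = (2, 5, -6, -2)     # normal vector (the coefficient vector)
P = (3, -2, 1, -4)     # two points satisfying 2x + 5y - 6z - 2w = -2
Q = (0, 0, 1, -2)

dot = lambda a, b: sum(x * y for x, y in zip(a, b))
assert dot(u, P) == -2 and dot(u, Q) == -2   # both points lie on H

v = tuple(b - a for a, b in zip(P, Q))       # v = PQ = Q - P
print(dot(u, v))   # 0, so PQ is orthogonal to the normal u
```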
1.32. Find an equation of the hyperplane H in R⁴ if: (i) H passes through P = (3, −2, 1, −4)
and is normal to u = (2, 5, −6, −2); (ii) H passes through P = (1, −2, 3, 5) and is
parallel to the hyperplane H′ determined by 4x − 5y + 2z + w = 11.

(i)  An equation of H is of the form 2x + 5y − 6z − 2w = k since H is normal to u. Substitute
     P into this equation to obtain k = −2. Thus an equation of H is 2x + 5y − 6z − 2w = −2.
(ii) H and H′ are parallel iff corresponding normal vectors are in the same or opposite direction.
     Hence an equation of H is of the form 4x − 5y + 2z + w = k. Substituting P into this equa-
     tion, we find k = 25. Thus an equation of H is 4x − 5y + 2z + w = 25.
1.33. The line l in Rⁿ passing through the point P = (aᵢ) and in the direction of u = (uᵢ) ≠ 0
consists of the points X = P + tu, t ∈ R, that is, consists of the points X = (xᵢ) obtained from

        x₁ = a₁ + u₁t
        x₂ = a₂ + u₂t
        .............    (*)
        xₙ = aₙ + uₙt

where t takes on all real values. The variable t is called a parameter, and (*) is called a
parametric representation of l.
(i)  Find a parametric representation of the line passing through P and in the direc-
     tion of u where: (a) P = (2, 5) and u = (−3, 4); (b) P = (4, −2, 3, 1) and u =
     (2, 5, −7, 11).

(ii) Find a parametric representation of the line passing through the points P and Q
     where: (a) P = (7, −2) and Q = (9, 3); (b) P = (5, 4, −3) and Q = (1, −3, 2).

(i)  In each case use the formula (*).

     (a) x = 2 − 3t        (b) x = 4 + 2t
         y = 5 + 4t            y = −2 + 5t
                               z = 3 − 7t
                               w = 1 + 11t

     (In R² we usually eliminate t from the two equations and represent the line by a single
     equation: 4x + 3y = 23.)

(ii) First compute u = PQ = Q − P. Then use the formula (*).

     (a) u = Q − P = (2, 5)      (b) u = Q − P = (−4, −7, 5)

         x = 7 + 2t                  x = 5 − 4t
         y = −2 + 5t                 y = 4 − 7t
                                     z = −3 + 5t

     (Note that in each case we could equally use the direction u = P − Q.)
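The representation X = P + tu can be sketched in code; this illustration (not from the original text) reproduces part (ii)(a):

```python
# Parametric representation of the line through P = (7, -2) and Q = (9, 3).
P = (7, -2)
Q = (9, 3)
u = tuple(b - a for a, b in zip(P, Q))   # direction u = Q - P = (2, 5)

def point(t):
    # The point on the line with parameter value t.
    return tuple(p + t * d for p, d in zip(P, u))

print(point(0))   # (7, -2), i.e. P
print(point(1))   # (9, 3), i.e. Q
```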
Supplementary Problems
VECTORS IN Rⁿ

1.34. Let u = (1, −2, 5), v = (3, 1, −2). Find: (i) u + v; (ii) 6u; (iii) 2u − 5v; (iv) u·v; (v) ||u||
and ||v||; (vi) d(u, v).
1.35. Let u = (2, −1, 0, −3), v = (1, −1, −1, 3), w = (1, 3, −2, 2). Find: (i) 2u − 3v; (ii) 5u − 3v − 4w;
(iii) −u + 2v − 2w; (iv) u·v, u·w and v·w; (v) d(u, v) and d(v, w).
1.36. Let u = (2, 1, −3, 0, 4), v = (5, −3, −1, 2, 7). Find: (i) u + v; (ii) 3u − 2v; (iii) u·v; (iv) ||u||
and ||v||; (v) d(u, v).
1.37. Determine k so that the vectors u and v are orthogonal. (i) u = (3, k, −2), v = (6, −4, −3). (ii) u =
(5, k, −4, 2), v = (1, −3, 2, 2k). (iii) u = (1, 7, k + 2, −2), v = (3, k, −3, k).
1.38. Determine x and y if: (i) (x, x + y) = (y − 2, 6); (ii) x(1, 2) = −4(y, 3).

1.39. Determine x and y if: (i) x(3, 2) = 2(y, −1); (ii) x(2, y) = y(1, −2).
1.40. Determine x, y and z if:
(i)  (3, 1, 2)  = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)
(ii) (−1, 3, 3) = x(1, 1, 0) + y(0, 0, 1) + z(0, 1, 1)
1.41. Let e₁ = (1, 0, 0), e₂ = (0, 1, 0), e₃ = (0, 0, 1). Show that for any vector u = (a, b, c) in R³:
(i) u = ae₁ + be₂ + ce₃; (ii) u·e₁ = a, u·e₂ = b, u·e₃ = c.
1.42. Generalize the result in the preceding problem as follows. Let eᵢ ∈ Rⁿ be the vector with 1 in the
ith coordinate and 0 elsewhere:
        e₁ = (1, 0, 0, ..., 0, 0),  e₂ = (0, 1, 0, ..., 0, 0),  ...,  eₙ = (0, 0, 0, ..., 0, 1)
Show that for any vector u = (a₁, a₂, ..., aₙ),
(i) u = a₁e₁ + a₂e₂ + ··· + aₙeₙ,  (ii) u·eᵢ = aᵢ for i = 1, ..., n
1.43. Suppose u ∈ Rⁿ has the property that u·v = 0 for every v ∈ Rⁿ. Show that u = 0.

1.44. Using d(u, v) = ||u − v|| and the norm properties [N₁], [N₂] and [N₃] in Problem 1.18, show that
the distance function satisfies the following properties for any vectors u, v, w ∈ Rⁿ:
(i) d(u, v) ≥ 0, and d(u, v) = 0 iff u = v; (ii) d(u, v) = d(v, u); (iii) d(u, w) ≤ d(u, v) + d(v, w).
COMPLEX NUMBERS
1.45. Simplify: (i) (4 − 7i)(9 + 2i); (ii) (3 − 5i)²; (iii) 1/(4 − 7i); (iv) (9 + 2i)/(3 − 5i); (v) (1 − i)³.

1.46. Simplify: (i) 1/(2i); (ii) (2 + 3i)/(7 − 3i); (iii) i⁵, i²⁵, i³⁴; (iv) (1/(3 − i))².

1.47. Let z = 2 − 5i and w = 7 + 3i. Find: (i) z + w; (ii) zw; (iii) z/w; (iv) z̄, w̄; (v) |z|, |w|.

1.48. Let z = 2 + i and w = 6 − 5i. Find: (i) z/w; (ii) z̄, w̄; (iii) |z|, |w|.

1.49. Show that: (i) zz̄ = |z|²; (ii) |z| = |z̄|; (iii) the real part of z = ½(z + z̄); (iv) the imaginary part
of z = (z − z̄)/2i.
1.50. Show that zw = 0 implies z = 0 or w = 0.

VECTORS IN Cⁿ
1.51. Let u = (1 + 7i, 2 − 6i) and v = (5 − 2i, 3 − 4i). Find: (i) u + v; (ii) (3 + i)u; (iii) 2iu + (4 − 7i)v;
(iv) u·v and v·u; (v) ||u|| and ||v||.

1.52. Let u = (3 − 7i, 2i, −1 + i) and v = (4 − i, 11 + 2i, 8 − 3i). Find: (i) u − v; (ii) (3 + i)v;
(iii) u·v and v·u; (iv) ||u|| and ||v||.
1.53. Prove: For any vectors u, v, w ∈ Cⁿ:
(i) (u + v)·w = u·w + v·w; (ii) w·(u + v) = w·u + w·v. (Compare with Theorem 1.2.)

1.54. Prove that the norm in Cⁿ satisfies the following laws:
[N₁]: For any vector u, ||u|| ≥ 0; and ||u|| = 0 iff u = 0.
[N₂]: For any vector u and any complex number z, ||zu|| = |z| ||u||.
[N₃]: For any vectors u and v, ||u + v|| ≤ ||u|| + ||v||.
(Compare with Problem 1.18.)
MISCELLANEOUS PROBLEMS
1.55. Find an equation of the hyperplane in R³ which:
(i)   passes through (2, −7, 1) and is normal to (3, 1, −11);
(ii)  contains (1, −2, 2), (0, 1, 3) and (0, 2, −1);
(iii) contains (1, −5, 2) and is parallel to 3x − 7y + 4z = 5.

1.56. Determine the value of k such that 2x − ky + 4z − 5w = 11 is perpendicular to 7x + 2y − z +
2w = 8. (Two hyperplanes are perpendicular iff corresponding normal vectors are orthogonal.)
1.57. Find a parametric representation of the line which:
(i)   passes through (7, −1, 8) in the direction of (1, 3, −5)
(ii)  passes through (1, 9, 4, 5) and (2, −3, 0, 4)
(iii) passes through (4, 1, 9) and is perpendicular to the plane 3x − 2y + z = 18

1.58. Let P, Q and R be the points on the line determined by
        x₁ = a₁ + u₁t,  x₂ = a₂ + u₂t,  ...,  xₙ = aₙ + uₙt
which correspond respectively to the values t₁, t₂ and t₃ of t. Show that if t₁ < t₂ < t₃, then
d(P, Q) + d(Q, R) = d(P, R).
Answers to Supplementary Problems
1.34. (i) u + v = (4, −1, 3); (ii) 6u = (6, −12, 30); (iii) 2u − 5v = (−13, −9, 20); (iv) u·v = −9;
(v) ||u|| = √30, ||v|| = √14; (vi) d(u, v) = √62

1.35. (i) 2u − 3v = (1, 1, 3, −15); (ii) 5u − 3v − 4w = (3, −14, 11, −32); (iii) −u + 2v − 2w = (−2, −7, 2, 5);
(iv) u·v = −6, u·w = −7, v·w = 6; (v) d(u, v) = √38, d(v, w) = 3√2

1.36. (i) u + v = (7, −2, −4, 2, 11); (ii) 3u − 2v = (−4, 9, −7, −4, −2); (iii) u·v = 38; (iv) ||u|| = √30,
||v|| = 2√22; (v) d(u, v) = √42

1.37. (i) k = 6; (ii) k = 3; (iii) k = 3/2

1.38. (i) x = 2, y = 4; (ii) x = −6, y = 3/2

1.39. (i) x = −1, y = −3/2; (ii) x = 0, y = 0; or x = −2, y = −4

1.40. (i) x = 2, y = −1, z = 2; (ii) x = −1, y = −1, z = 4

1.43. We have that u·u = 0, which implies that u = 0.

1.45. (i) 50 − 55i; (ii) −16 − 30i; (iii) (4 + 7i)/65; (iv) (1 + 3i)/2; (v) −2 − 2i.

1.46. (i) −½i; (ii) (5 + 27i)/58; (iii) i, i, −1; (iv) (4 + 3i)/50.

1.47. (i) z + w = 9 − 2i; (ii) zw = 29 − 29i; (iii) z/w = (−1 − 41i)/58; (iv) z̄ = 2 + 5i, w̄ = 7 − 3i;
(v) |z| = √29, |w| = √58

1.48. (i) z/w = (7 + 16i)/61; (ii) z̄ = 2 − i, w̄ = 6 + 5i; (iii) |z| = √5, |w| = √61.

1.50. If zw = 0, then |zw| = |z| |w| = |0| = 0. Hence |z| = 0 or |w| = 0; and so z = 0 or w = 0.

1.51. (i)   u + v = (6 + 5i, 5 − 10i)               (iv) u·v = 21 + 27i, v·u = 21 − 27i
      (ii)  (3 + i)u = (−4 + 22i, 12 − 16i)         (v)  ||u|| = 3√10, ||v|| = 3√6
      (iii) 2iu + (4 − 7i)v = (−8 − 41i, −4 − 33i)

1.52. (i)  u − v = (−1 − 6i, −11, −9 + 4i)          (iii) u·v = 12 + 2i, v·u = 12 − 2i
      (ii) (3 + i)v = (13 + i, 31 + 17i, 27 − i)    (iv)  ||u|| = 8, ||v|| = √215

1.55. (i) 3x + y − 11z = −12; (ii) 13x + 4y + z = 7; (iii) 3x − 7y + 4z = 46.

1.56. k = 0

1.57. (i)   x = 7 + t, y = −1 + 3t, z = 8 − 5t
      (ii)  x = 1 + t, y = 9 − 12t, z = 4 − 4t, w = 5 − t
      (iii) x = 4 + 3t, y = 1 − 2t, z = 9 + t
chapter 2
Linear Equations
INTRODUCTION
The theory of linear equations plays an important and motivating role in the subject
of linear algebra. In fact, many problems in linear algebra are equivalent to studying a
system of linear equations, e.g. finding the kernel of a linear mapping and characterizing
the subspace spanned by a set of vectors. Thus the techniques introduced in this chapter
will be applicable to the more abstract treatment given later. On the other hand, some of
the results of the abstract treatment will give us new insights into the structure of "con
crete" systems of linear equations.
For simplicity, we assume that all equations in this chapter are over the real field R. We
emphasize that the results and techniques also hold for equations over the complex field C
or over any arbitrary field K.
LINEAR EQUATION
By a linear equation over the real field R, we mean an expression of the form

        a₁x₁ + a₂x₂ + ··· + aₙxₙ = b    (1)

where the aᵢ, b ∈ R and the xᵢ are indeterminates (or: unknowns or variables). The scalars
aᵢ are called the coefficients of the xᵢ respectively, and b is called the constant term or simply
constant of the equation. A set of values for the unknowns, say

        x₁ = k₁,  x₂ = k₂,  ...,  xₙ = kₙ

is a solution of (1) if the statement obtained by substituting kᵢ for xᵢ,

        a₁k₁ + a₂k₂ + ··· + aₙkₙ = b

is true. This set of values is then said to satisfy the equation. If there is no ambiguity
about the position of the unknowns in the equation, then we denote this solution by simply
the n-tuple

        u = (k₁, k₂, ..., kₙ)
Example 2.1:  Consider the equation x + 2y − 4z + w = 3.
              The 4-tuple u = (3, 2, 1, 0) is a solution of the equation since
                      3 + 2·2 − 4·1 + 0 = 3   or   3 = 3
              is a true statement. However, the 4-tuple v = (1, 2, 4, 5) is not a solution of the
              equation since
                      1 + 2·2 − 4·4 + 5 = 3   or   −6 = 3
              is not a true statement.
Solutions of the equation (1) can be easily described and obtained. There are three
cases:
Case (i): One of the coefficients in (1) is not zero, say a₁ ≠ 0. Then we can rewrite the
equation as follows:

        a₁x₁ = b − a₂x₂ − ··· − aₙxₙ   or   x₁ = a₁⁻¹b − a₁⁻¹a₂x₂ − ··· − a₁⁻¹aₙxₙ
By arbitrarily assigning values to the unknowns x₂, ..., xₙ, we obtain a value for x₁; these
values form a solution of the equation. Furthermore, every solution of the equation can
be obtained in this way. Note in particular that the linear equation in one unknown,

        ax = b,  with a ≠ 0

has the unique solution x = a⁻¹b.
Example 2.2:  Consider the equation 2x − 4y + z = 8.
              We rewrite the equation as
                      2x = 8 + 4y − z   or   x = 4 + 2y − ½z
              Any value for y and z will yield a value for x, and the three values will be a solution
              of the equation. For example, let y = 3 and z = 2; then x = 4 + 2·3 − ½·2 = 9.
              In other words, the 3-tuple u = (9, 3, 2) is a solution of the equation.
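Case (i) can be sketched in code for Example 2.2 (an illustration added here, not from the text): once the free unknowns y and z are assigned, x is determined.

```python
def solve_for_x(y, z):
    # From 2x - 4y + z = 8:  x = 4 + 2y - z/2.
    return (8 + 4 * y - z) / 2

print(solve_for_x(3, 2))    # 9.0, giving the solution (9, 3, 2)
print(solve_for_x(0, 0))    # 4.0, another solution: (4, 0, 0)
```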
Case (ii): All the coefficients in (1) are zero, but the constant is not zero. That is, the
equation is of the form

        0x₁ + 0x₂ + ··· + 0xₙ = b,  with b ≠ 0

Then the equation has no solution.

Case (iii): All the coefficients in (1) are zero, and the constant is also zero. That is,
the equation is of the form

        0x₁ + 0x₂ + ··· + 0xₙ = 0

Then every n-tuple of scalars in R is a solution of the equation.
SYSTEM OF LINEAR EQUATIONS
We now consider a system of m linear equations in the n unknowns x₁, ..., xₙ:

        a₁₁x₁ + a₁₂x₂ + ··· + a₁ₙxₙ = b₁
        a₂₁x₁ + a₂₂x₂ + ··· + a₂ₙxₙ = b₂
        ...................................    (*)
        aₘ₁x₁ + aₘ₂x₂ + ··· + aₘₙxₙ = bₘ

where the aᵢⱼ, bᵢ belong to the real field R. The system is said to be homogeneous if the con-
stants b₁, ..., bₘ are all 0. An n-tuple u = (k₁, ..., kₙ) of real numbers is a solution (or:
a particular solution) if it satisfies each of the equations; the set of all such solutions is
termed the solution set or the general solution.
The system of linear equations
        a₁₁x₁ + a₁₂x₂ + ··· + a₁ₙxₙ = 0
        a₂₁x₁ + a₂₂x₂ + ··· + a₂ₙxₙ = 0
        ...................................    (**)
        aₘ₁x₁ + aₘ₂x₂ + ··· + aₘₙxₙ = 0

is called the homogeneous system associated with (*). The above system always has a solu-
tion, namely the zero n-tuple 0 = (0, 0, ..., 0), called the zero or trivial solution. Any
other solution, if it exists, is called a nonzero or nontrivial solution.

The fundamental relationship between the systems (*) and (**) follows.
Theorem 2.1:  Suppose u is a particular solution of the nonhomogeneous system (*) and
              suppose W is the general solution of the associated homogeneous system (**).
              Then
                      u + W = {u + w : w ∈ W}
              is the general solution of the nonhomogeneous system (*).
We emphasize that the above theorem is of theoretical interest and does not help us to
obtain explicit solutions of the system (*). This is done by the usual method of elimination
described in the next section.
SOLUTION OF A SYSTEM OF LINEAR EQUATIONS
Consider the above system (*) of linear equations. We reduce it to a simpler system as
follows:
Step 1.  Interchange equations so that the first unknown x₁ has a nonzero coeffi-
         cient in the first equation, that is, so that a₁₁ ≠ 0.

Step 2.  For each i > 1, apply the operation

                 Lᵢ → −aᵢ₁L₁ + a₁₁Lᵢ

         That is, replace the ith linear equation Lᵢ by the equation obtained by mul-
         tiplying the first equation L₁ by −aᵢ₁, multiplying the ith equation Lᵢ by
         a₁₁, and then adding.
We then obtain the following system which (Problem 2.13) is equivalent to (*), i.e. has
the same solution set as (*):

        a₁₁x₁ + a₁₂x₂ + a₁₃x₃ + ··· + a₁ₙxₙ = b₁
                a′₂ⱼ₂ xⱼ₂ + ··· + a′₂ₙxₙ = b₂
                ..........................
                a′ₘⱼ₂ xⱼ₂ + ··· + a′ₘₙxₙ = bₘ

where a₁₁ ≠ 0. Here xⱼ₂ denotes the first unknown with a nonzero coefficient in an equation
other than the first; by Step 2, xⱼ₂ ≠ x₁. This process which eliminates an unknown from
succeeding equations is known as (Gauss) elimination.
Example 2.3:  Consider the following system of linear equations:

                      2x + 4y − z + 2v + 2w = 1
                      3x + 6y + z −  v + 4w = −7
                      4x + 8y + z + 5v −  w = 3

              We eliminate the unknown x from the second and third equations by applying the
              following operations:

                      L₂ → −3L₁ + 2L₂   and   L₃ → −2L₁ + L₃

              We compute
                      −3L₁:         −6x − 12y + 3z − 6v − 6w = −3
                       2L₂:          6x + 12y + 2z − 2v + 8w = −14
                      −3L₁ + 2L₂:               5z − 8v + 2w = −17
              and
                      −2L₁:         −4x − 8y + 2z − 4v − 4w = −2
                       L₃:           4x + 8y +  z + 5v −  w = 3
                      −2L₁ + L₃:                3z +  v − 5w = 1
              Thus the original system has been reduced to the following equivalent system:

                      2x + 4y − z + 2v + 2w = 1
                               5z − 8v + 2w = −17
                               3z +  v − 5w = 1

              Observe that y has also been eliminated from the second and third equations. Here
              the unknown z plays the role of the unknown xⱼ₂ above.
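The operation Lᵢ → −aᵢ₁L₁ + a₁₁Lᵢ acts on rows of numbers, so it is easy to automate. A sketch (added here for illustration) applied to the system of Example 2.3, where each row lists the coefficients followed by the constant:

```python
def eliminate(L1, Li):
    # Replace Li by -a_i1*L1 + a_11*Li, killing the first unknown in row Li.
    a11, ai1 = L1[0], Li[0]
    return [-ai1 * x + a11 * y for x, y in zip(L1, Li)]

L1 = [2, 4, -1, 2, 2, 1]     # 2x + 4y -  z + 2v + 2w =  1
L2 = [3, 6, 1, -1, 4, -7]    # 3x + 6y +  z -  v + 4w = -7
L3 = [4, 8, 1, 5, -1, 3]     # 4x + 8y +  z + 5v -  w =  3

print(eliminate(L1, L2))     # [0, 0, 5, -8, 2, -17]: 5z - 8v + 2w = -17
print(eliminate(L1, L3))     # [0, 0, 6, 2, -10, 2]: twice 3z + v - 5w = 1
```

The text used −2L₁ + L₃ rather than the generic −4L₁ + 2L₃, so the last row comes out doubled here; both rows describe the same equation.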
We note that the above equations, excluding the first, form a subsystem which has
fewer equations and fewer unknowns than the original system (*). We also note that:

    (i)  if an equation 0x₁ + ··· + 0xₙ = b, b ≠ 0 occurs, then the system is incon-
         sistent and has no solution;

    (ii) if an equation 0x₁ + ··· + 0xₙ = 0 occurs, then the equation can be deleted
         without affecting the solution.
Continuing the above process with each new "smaller" subsystem, we obtain by induction
that the system (*) is either inconsistent or is reducible to an equivalent system in the
following form:

        a₁₁x₁ + a₁₂x₂ + a₁₃x₃ + ··· + a₁ₙxₙ = b₁
                a₂ⱼ₂ xⱼ₂ + a₂,ⱼ₂₊₁ xⱼ₂₊₁ + ··· + a₂ₙxₙ = b₂
                ..........................................    (***)
                aᵣⱼᵣ xⱼᵣ + aᵣ,ⱼᵣ₊₁ xⱼᵣ₊₁ + ··· + aᵣₙxₙ = bᵣ

where 1 < j₂ < ··· < jᵣ and where the leading coefficients are not zero:

        a₁₁ ≠ 0,  a₂ⱼ₂ ≠ 0,  ...,  aᵣⱼᵣ ≠ 0

(For notational convenience we use the same symbols aᵢₖ, bₖ in the system (***) as we used
in the system (*), but clearly they may denote different scalars.)
Definition:  The above system (***) is said to be in echelon form; the unknowns xᵢ which
             do not appear at the beginning of any equation (i ≠ 1, j₂, ..., jᵣ) are termed
             free variables.
The following theorem applies.
Theorem 2.2:  The solution of the system (***) in echelon form is as follows. There are
              two cases:
              (i)  r = n. That is, there are as many equations as unknowns. Then the
                   system has a unique solution.
              (ii) r < n. That is, there are fewer equations than unknowns. Then we
                   can arbitrarily assign values to the n − r free variables and obtain a
                   solution of the system.
Note in particular that the above theorem implies that the system (***) and any equiv
alent systems are consistent. Thus if the system (*) is consistent and reduces to case (ii)
above, then we can assign many different values to the free variables and so obtain many
solutions of the system. The following diagram illustrates this situation.
        System of linear equations
        ├── Inconsistent ─────────── no solution
        └── Consistent ──┬───────── unique solution
                         └───────── more than one solution
In view of Theorem 2.1, the unique solution above can only occur when the associated
homogeneous system has only the zero solution.
Example 2.4:  We reduce the following system by applying the operations L₂ → −3L₁ + 2L₂ and
              L₃ → −3L₁ + 2L₃, and then the operation L₃ → −3L₂ + L₃:

                  2x +  y − 2z + 3w = 1
                  3x + 2y −  z + 2w = 4
                  3x + 3y + 3z − 3w = 5
              to
                  2x +  y −  2z +  3w = 1
                        y +  4z −  5w = 5
                       3y + 12z − 15w = 7
              to
                  2x +  y − 2z + 3w = 1
                        y + 4z − 5w = 5
                                  0 = −8

              The equation 0 = −8, that is, 0x + 0y + 0z + 0w = −8, shows that the original
              system is inconsistent, and so has no solution.
Example 2.5:  We reduce the following system by applying the operations L₂ → −L₁ + L₂,
              L₃ → −2L₁ + L₃ and L₄ → −2L₁ + L₄, and then the operations L₃ → L₂ − L₃
              and L₄ → −2L₂ + L₄:

                   x + 2y − 3z = 4
                   x + 3y +  z = 11
                  2x + 5y − 4z = 13
                  2x + 6y + 2z = 22
              to
                   x + 2y − 3z = 4
                        y + 4z = 7
                        y + 2z = 5
                       2y + 8z = 14
              to
                   x + 2y − 3z = 4
                        y + 4z = 7
                            2z = 2
                             0 = 0
              or simply
                   x + 2y − 3z = 4
                        y + 4z = 7
                            2z = 2

              Observe first that the system is consistent since there is no equation of the form
              0 = b, with b ≠ 0. Furthermore, since in echelon form there are three equations
              in the three unknowns, the system has a unique solution. By the third equation,
              z = 1. Substituting z = 1 into the second equation, we obtain y = 3. Substitut-
              ing z = 1 and y = 3 into the first equation, we find x = 1. Thus x = 1, y = 3
              and z = 1 or, in other words, the 3-tuple (1, 3, 1) is the unique solution of the
              system.
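Back-substitution through an echelon system is a few lines of code; this sketch (an illustration added here) walks the three equations of Example 2.5 from the bottom up:

```python
# Back-substitution through the echelon system of Example 2.5:
#   x + 2y - 3z = 4,   y + 4z = 7,   2z = 2
z = 2 / 2              # third equation
y = 7 - 4 * z          # second equation
x = 4 - 2 * y + 3 * z  # first equation
print((x, y, z))       # (1.0, 3.0, 1.0)
```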
Example 2.6:  We reduce the following system by applying the operations L₂ → −2L₁ + L₂ and
              L₃ → −5L₁ + L₃, and then the operation L₃ → −2L₂ + L₃:

                   x +  2y − 2z +  3w = 2
                  2x +  4y − 3z +  4w = 5
                  5x + 10y − 8z + 11w = 12
              to
                   x + 2y − 2z + 3w = 2
                             z − 2w = 1
                            2z − 4w = 2
              to
                   x + 2y − 2z + 3w = 2
                             z − 2w = 1
                                  0 = 0
              The system is consistent, and since there are more unknowns than equations in
              echelon form, the system has an infinite number of solutions. In fact, there are
              two free variables, y and w, and so a particular solution can be obtained by giving
              y and w any values. For example, let w = 1 and y = −2. Substituting w = 1
              into the second equation, we obtain z = 3. Putting w = 1, z = 3 and y = −2
              into the first equation, we find x = 9. Thus x = 9, y = −2, z = 3 and w = 1 or,
              in other words, the 4-tuple (9, −2, 3, 1) is a particular solution of the system.
Remark:  We find the general solution of the system in the above example as follows.
         Let the free variables be assigned arbitrary values; say, y = a and w = b.
         Substituting w = b into the second equation, we obtain z = 1 + 2b. Putting
         y = a, z = 1 + 2b and w = b into the first equation, we find x = 4 − 2a + b.
         Thus the general solution of the system is

                 x = 4 − 2a + b,  y = a,  z = 1 + 2b,  w = b

         or, in other words, (4 − 2a + b, a, 1 + 2b, b), where a and b are arbitrary num-
         bers. Frequently, the general solution is left in terms of the free variables y
         and w (instead of a and b) as follows:

                 x = 4 − 2y + w,  z = 1 + 2w   or   (4 − 2y + w, y, 1 + 2w, w)

         We will investigate further the representation of the general solution of a
         system of linear equations in a later chapter.
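A general solution is just a function of the free variables; this sketch (added here for illustration) encodes the parametrization of Example 2.6 and checks it against the original equations:

```python
def solution(a, b):
    # General solution of the system of Example 2.6 with y = a, w = b.
    z = 1 + 2 * b
    x = 4 - 2 * a + b
    return (x, a, z, b)

print(solution(-2, 1))   # (9, -2, 3, 1), the particular solution found above

for a, b in [(0, 0), (5, -3)]:
    x, y, z, w = solution(a, b)
    assert x + 2*y - 2*z + 3*w == 2
    assert 2*x + 4*y - 3*z + 4*w == 5
    assert 5*x + 10*y - 8*z + 11*w == 12
```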
Example 2.7:  Consider two equations in two unknowns:

                      a₁x + b₁y = c₁
                      a₂x + b₂y = c₂

              According to our theory, exactly one of the following three cases must occur:
              (i)   The system is inconsistent.
              (ii)  The system is equivalent to two equations in echelon form.
              (iii) The system is equivalent to one equation in echelon form.
              Since linear equations in two unknowns with real coefficients can be represented
              as lines in the plane R², the above cases can be interpreted geometrically as follows:
              (i)   The two lines are parallel.
              (ii)  The two lines intersect in a unique point.
              (iii) The two lines are coincident.
SOLUTION OF A HOMOGENEOUS SYSTEM OF LINEAR EQUATIONS
If we begin with a homogeneous system of linear equations, then the system is clearly
consistent since, for example, it has the zero solution 0 = (0, 0, ..., 0). Thus it can always
be reduced to an equivalent homogeneous system in echelon form:

        a₁₁x₁ + a₁₂x₂ + a₁₃x₃ + ··· + a₁ₙxₙ = 0
                a₂ⱼ₂ xⱼ₂ + a₂,ⱼ₂₊₁ xⱼ₂₊₁ + ··· + a₂ₙxₙ = 0
                ..........................................
                aᵣⱼᵣ xⱼᵣ + aᵣ,ⱼᵣ₊₁ xⱼᵣ₊₁ + ··· + aᵣₙxₙ = 0

Hence we have the two possibilities:
(i)  r = n. Then the system has only the zero solution.
(ii) r < n. Then the system has a nonzero solution.
If we begin with fewer equations than unknowns then, in echelon form, r < n, and
hence the system has a nonzero solution. That is,
Theorem 2.3: A homogeneous system of linear equations with more unknowns than
equations has a nonzero solution.
Example 2.8:  The homogeneous system

                       x + 2y − 3z +  w = 0
                       x − 3y +  z − 2w = 0
                      2x +  y − 3z + 5w = 0

              has a nonzero solution since there are four unknowns but only three equations.
Example 2.9:  We reduce the following system to echelon form:

                   x +  y −  z = 0
                  2x − 3y +  z = 0
                   x − 4y + 2z = 0
              to
                   x + y −  z = 0
                      5y − 3z = 0
                      5y − 3z = 0
              to
                   x + y −  z = 0
                      5y − 3z = 0

              The system has a nonzero solution, since we obtained only two equations in the
              three unknowns in echelon form. For example, let z = 5; then y = 3 and x = 2.
              In other words, the 3-tuple (2, 3, 5) is a particular nonzero solution.
Example 2.10:  We reduce the following system to echelon form:

                    x +  y −  z = 0
                   2x + 4y −  z = 0
                   3x + 2y + 2z = 0
               to
                    x + y − z = 0
                       2y + z = 0
                      −y + 5z = 0
               to
                    x + y − z = 0
                       2y + z = 0
                          11z = 0

               Since in echelon form there are three equations in three unknowns, the system has
               only the zero solution (0, 0, 0).
Solved Problems
SOLUTION OF LINEAR EQUATIONS
2.1. Solve the system:    2x − 3y + 6z + 2v − 5w = 3
                                y − 4z +  v      = 1
                                           v − 3w = 2
The system is in echelon form. Since the equations begin with the unknowns x, y and v re-
spectively, the other unknowns, z and w, are the free variables.

To find the general solution, let, say, z = a and w = b. Substituting into the third equation,

        v − 3b = 2   or   v = 2 + 3b

Substituting into the second equation,

        y − 4a + 2 + 3b = 1   or   y = 4a − 3b − 1

Substituting into the first equation,

        2x − 3(4a − 3b − 1) + 6a + 2(2 + 3b) − 5b = 3   or   x = 3a − 5b − 2

Thus the general solution of the system is

        x = 3a − 5b − 2,  y = 4a − 3b − 1,  z = a,  v = 2 + 3b,  w = b

or (3a − 5b − 2, 4a − 3b − 1, a, 2 + 3b, b), where a and b are arbitrary real numbers. Some texts
leave the general solution in terms of the free variables z and w instead of a and b as follows:
        x = 3z − 5w − 2
        y = 4z − 3w − 1       or      (3z − 5w − 2, 4z − 3w − 1, z, 2 + 3w, w)
        v = 2 + 3w

After finding the general solution, we can find a particular solution by substituting into the
general solution. For example, let a = 2 and b = 1; then

        x = −1,  y = 4,  z = 2,  v = 5,  w = 1   or   (−1, 4, 2, 5, 1)

is a particular solution of the given system.
X + 2y3z = 1
2.2. Solve the system: 3a; — y + 2z = 7 .
5x + 3y  4z = 2
Reduce to echelon form. Eliminate x from the second and third equations by the operations
1,2 ♦ — 3Li + L2 and Lg * —5Li + L3:
3Li: 3x 6y+9z= 3 SLj: 5x  lOy + 15z = 5
L2: 3x y + 2z = 7 L3: 5a: + 3^/  4z = 2
3Li + Lg: 7j/ + llz = 10 5Li + L3: 7j/ + llz = 7
Thus we obtain the equivalent system
X + 2y — 3z = 1
7y + llz = 10
7y + llz = 7
The second and third equations show that the system is inconsistent, for if we subtract we obtain
Ox + Oy + Oz = S or = 3.
2.3. Solve the system:    2x +  y − 2z = 10
                          3x + 2y + 2z = 1
                          5x + 4y + 3z = 4

Reduce to echelon form. Eliminate x from the second and third equations by the operations
L₂ → −3L₁ + 2L₂ and L₃ → −5L₁ + 2L₃:

        −3L₁:        −6x − 3y + 6z = −30      −5L₁:        −10x − 5y + 10z = −50
         2L₂:         6x + 4y + 4z = 2         2L₃:         10x + 8y +  6z = 8
        −3L₁ + 2L₂:       y + 10z = −28       −5L₁ + 2L₃:       3y + 16z = −42

Thus we obtain the following system, from which we eliminate y from the third equation by the
operation L₃ → −3L₂ + L₃:

        2x + y −  2z = 10                2x + y −  2z = 10
             y + 10z = −28       to           y + 10z = −28
            3y + 16z = −42                       −14z = 42

In echelon form there are three equations in the three unknowns; hence the system has a unique
solution. By the third equation, z = −3. Substituting into the second equation, we find y = 2.
Substituting into the first equation, we obtain x = 1. Thus x = 1, y = 2 and z = −3, i.e. the 3-tuple
(1, 2, −3), is the unique solution of the system.
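A solution claim is easy to verify by substitution; this check (added here for illustration) feeds (1, 2, −3) back into the original three equations:

```python
# Substituting the solution (1, 2, -3) of Problem 2.3 back into the
# original equations.
x, y, z = 1, 2, -3
print(2*x +   y - 2*z)   # 10
print(3*x + 2*y + 2*z)   # 1
print(5*x + 4*y + 3*z)   # 4
```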
2.4. Solve the system:     x + 2y − 3z = 6
                          2x −  y + 4z = 2
                          4x + 3y − 2z = 14

Reduce the system to echelon form. Eliminate x from the second and third equations by the
operations L₂ → −2L₁ + L₂ and L₃ → −4L₁ + L₃:

        −2L₁:     −2x − 4y + 6z = −12       −4L₁:     −4x − 8y + 12z = −24
         L₂:       2x −  y + 4z = 2          L₃:       4x + 3y −  2z = 14
                      −5y + 10z = −10                     −5y + 10z = −10
        or             y −  2z = 2          or             y −  2z = 2

Thus the system is equivalent to

         x + 2y − 3z = 6
             y −  2z = 2          or simply         x + 2y − 3z = 6
             y −  2z = 2                                y −  2z = 2

(Since the second and third equations are identical, we can disregard one of them.)

In echelon form there are only two equations in the three unknowns; hence the system has an
infinite number of solutions and, in particular, 3 − 2 = 1 free variable, which is z.

To obtain the general solution let, say, z = a. Substitute into the second equation to obtain
y = 2 + 2a. Substitute into the first equation to obtain x + 2(2 + 2a) − 3a = 6 or x = 2 − a.
Thus the general solution is

        x = 2 − a,  y = 2 + 2a,  z = a   or   (2 − a, 2 + 2a, a)

where a is any real number.

The value, say, a = 1 yields the particular solution x = 1, y = 4, z = 1 or (1, 4, 1).
2.5. Solve the system:     x − 3y + 4z − 2w = 5
                               2y + 5z +  w = 2
                                y − 3z      = 4

The system is not in echelon form since, for example, y appears as the first unknown in both
the second and third equations. However, if we rewrite the system so that w is the second unknown,
then we obtain the following system, which is in echelon form:

         x − 2w − 3y + 4z = 5
             w + 2y + 5z = 2
                  y − 3z = 4

Now if a 4-tuple (a, b, c, d) is given as a solution, it is not clear if b should be substituted for
w or for y; hence for theoretical reasons we consider the two systems to be distinct. Of course this
does not prohibit us from using the new system to obtain the solution of the original system.

Let z = a. Substituting into the third equation, we find y = 4 + 3a. Substituting into the
second equation, we obtain w + 2(4 + 3a) + 5a = 2 or w = −6 − 11a. Substituting into the first
equation,

        x − 2(−6 − 11a) − 3(4 + 3a) + 4a = 5   or   x = 5 − 17a

Thus the general solution of the original system is

        x = 5 − 17a,  y = 4 + 3a,  z = a,  w = −6 − 11a

where a is any real number.
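The general solution can be spot-checked for several values of the free variable at once; this sketch (added here for illustration, with the signs as worked out above) verifies all three original equations:

```python
def sol(a):
    # General solution of Problem 2.5 as a function of the free variable z = a.
    return (5 - 17*a, 4 + 3*a, a, -6 - 11*a)

for a in (0, 1, -2):
    x, y, z, w = sol(a)
    assert x - 3*y + 4*z - 2*w == 5
    assert 2*y + 5*z + w == 2
    assert y - 3*z == 4
print("ok")
```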
2.6. Determine the values of a so that the following system in unknowns x, y and z has:
(i) no solution, (ii) more than one solution, (iii) a unique solution:

         x +  y −  z = 1
        2x + 3y + az = 3
         x + ay + 3z = 2

Reduce the system to echelon form. Eliminate x from the second and third equations by the
operations L₂ → −2L₁ + L₂ and L₃ → −L₁ + L₃:

        −2L₁:      −2x − 2y + 2z = −2        −L₁:       −x −  y +  z = −1
         L₂:        2x + 3y + az = 3          L₃:        x + ay + 3z = 2
                      y + (a + 2)z = 1                  (a − 1)y + 4z = 1

Thus the equivalent system is

         x + y − z = 1
             y + (a + 2)z = 1
             (a − 1)y + 4z = 1

Now eliminate y from the third equation by the operation L₃ → −(a − 1)L₂ + L₃:

        −(a − 1)L₂:    −(a − 1)y + (2 − a − a²)z = 1 − a
         L₃:            (a − 1)y +            4z = 1
                                  (6 − a − a²)z = 2 − a
        or                     (3 + a)(2 − a)z = 2 − a

to obtain the equivalent system

         x + y − z = 1
             y + (a + 2)z = 1
             (3 + a)(2 − a)z = 2 − a

which has a unique solution if the coefficient of z in the third equation is not zero, that is, if a ≠ 2
and a ≠ −3. In case a = 2, the third equation is 0 = 0 and the system has more than one solu-
tion. In case a = −3, the third equation is 0 = 5 and the system has no solution.

Summarizing, we have: (i) a = −3, (ii) a = 2, (iii) a ≠ 2 and a ≠ −3.
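The case analysis on the last echelon equation can be written out directly; this sketch (an illustration added here) classifies the system by the value of a:

```python
def classify(a):
    # Last echelon equation of Problem 2.6: (3 + a)(2 - a) z = 2 - a.
    coeff = (3 + a) * (2 - a)
    const = 2 - a
    if coeff != 0:
        return "unique solution"
    return "no solution" if const != 0 else "more than one solution"

print(classify(-3))   # no solution
print(classify(2))    # more than one solution
print(classify(0))    # unique solution
```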
2.7. Which condition must be placed on a, b and c so that the following system in unknowns
x, y and z has a solution?

         x + 2y −  3z = a
        2x + 6y − 11z = b
         x − 2y +  7z = c

Reduce to echelon form. Eliminating x from the second and third equations by the operations
L₂ → −2L₁ + L₂ and L₃ → −L₁ + L₃, we obtain the equivalent system

         x + 2y −  3z = a
             2y −  5z = b − 2a
            −4y + 10z = c − a

Eliminating y from the third equation by the operation L₃ → 2L₂ + L₃, we finally obtain the
equivalent system

         x + 2y − 3z = a
             2y − 5z = b − 2a
                   0 = c + 2b − 5a
The system will have no solution if the third equation is of the form 0 = k, with k ≠ 0; that is,
if c + 2b − 5a ≠ 0. Thus the system will have at least one solution if

        c + 2b − 5a = 0   or   5a = 2b + c

Note, in this case, that the system will have more than one solution. In other words, the system
cannot have a unique solution.
HOMOGENEOUS SYSTEMS OF LINEAR EQUATIONS
2.8. Determine whether each system has a nonzero solution: , o , _ ,  n
x2y + Zz2w = x + 2y3z = 2x + 5y + 2z =
3xly2z + 4kw ^ 2x + 5y + 2z = x + Ay + 7z =
Ax + Sy + 5z + 2w = Sx y4z = x+3y + Sz =
(i) (ii) ("i)
(i) The system must have a nonzero solution since there are more unknowns than equations.
(ii) Reduce to echelon form:
x + 2yZz = x + 2ySz = x + 2y  3z =
2x + 5y + 2z = to y + 8z = to y + 8z =
3x y4z = 7y + 5z = 61z =
In echelon form there are exactly three equations in the three unknowns; hence the system has
a unique solution, the zero solution.
(iii) Reduce to echelon form:
         x + 2y - z = 0          x + 2y - z = 0
        2x + 5y + 2z = 0             y + 4z = 0         x + 2y - z = 0
         x + 4y + 7z = 0            2y + 8z = 0   to        y + 4z = 0
         x + 3y + 3z = 0             y + 4z = 0
In echelon form there are only two equations in the three unknowns; hence the system has a
nonzero solution.
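Counting the nonzero rows after elimination decides each case mechanically. A minimal sketch (the helper name `echelon_rank` is ours), using exact fractions to avoid rounding:

```python
from fractions import Fraction

def echelon_rank(a):
    """Reduce a homogeneous coefficient matrix to echelon form and
    return the number of nonzero rows."""
    m = [[Fraction(x) for x in row] for row in a]
    rank, col, rows, cols = 0, 0, len(m), len(m[0])
    while rank < rows and col < cols:
        pivot = next((r for r in range(rank, rows) if m[r][col]), None)
        if pivot is None:
            col += 1
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            f = m[r][col] / m[rank][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[rank])]
        rank, col = rank + 1, col + 1
    return rank

# (ii): 3 equations remain for 3 unknowns -- only the zero solution.
print(echelon_rank([[1, 2, -3], [2, 5, 2], [3, -1, -4]]))   # 3
# (iii): only 2 equations remain for 3 unknowns -- a nonzero solution exists.
print(echelon_rank([[1, 2, -1], [2, 5, 2], [1, 4, 7], [1, 3, 3]]))  # 2
```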
2.9. The vectors u1, ..., um in, say, R^n are said to be linearly dependent, or simply
dependent, if there exist scalars k1, ..., km, not all of them zero, such that
k1u1 + ... + kmum = 0. Otherwise they are said to be independent. Determine
whether the vectors u, v and w are dependent or independent where:
(i) u = (1, 1, 1), v = (2, -3, -1), w = (8, -7, -1)
(ii) u = (1, -2, -3), v = (2, 3, -1), w = (3, 2, 1)
(iii) u = (a1, a2), v = (b1, b2), w = (c1, c2)
In each case:
(a) let xu + yv + zw = 0 where x, y and z are unknown scalars;
(b) find the equivalent homogeneous system of equations;
(c) determine whether the system has a nonzero solution. If the system does, then the vectors are
dependent; if the system does not, then they are independent.
(i) Let xu + yv + zw = 0:
        x(1, 1, 1) + y(2, -3, -1) + z(8, -7, -1) = (0, 0, 0)
or      (x, x, x) + (2y, -3y, -y) + (8z, -7z, -z) = (0, 0, 0)
or      (x + 2y + 8z, x - 3y - 7z, x - y - z) = (0, 0, 0)
CHAP. 2] LINEAR EQUATIONS 29
Set corresponding components equal to each other and reduce the system to echelon form:
        x + 2y + 8z = 0       x + 2y + 8z = 0       x + 2y + 8z = 0       x + 2y + 8z = 0
        x - 3y - 7z = 0   to     -5y - 15z = 0  to       y + 3z = 0  to       y + 3z = 0
        x -  y -  z = 0          -3y -  9z = 0           y + 3z = 0
In echelon form there are only two equations in the three unknowns; hence the system has a
nonzero solution. Accordingly, the vectors are dependent.
Remark: We need not solve the system to determine dependence or independence; we only need
to know if a nonzero solution exists.
(ii)    x(1, -2, -3) + y(2, 3, -1) + z(3, 2, 1) = (0, 0, 0)
        (x, -2x, -3x) + (2y, 3y, -y) + (3z, 2z, z) = (0, 0, 0)
        (x + 2y + 3z, -2x + 3y + 2z, -3x - y + z) = (0, 0, 0)

         x + 2y + 3z = 0        x + 2y + 3z = 0        x + 2y + 3z = 0
        -2x + 3y + 2z = 0  to       7y + 8z = 0  to        7y + 8z = 0
        -3x - y + z = 0             5y + 10z = 0               30z = 0
In echelon form there are exactly three equations in the three unknowns; hence the system has
only the zero solution. Accordingly, the vectors are independent.
(iii)   x(a1, a2) + y(b1, b2) + z(c1, c2) = (0, 0)
        (a1x, a2x) + (b1y, b2y) + (c1z, c2z) = (0, 0)
        (a1x + b1y + c1z, a2x + b2y + c2z) = (0, 0)
and so the equivalent homogeneous system is
        a1x + b1y + c1z = 0
        a2x + b2y + c2z = 0
The system has a nonzero solution by Theorem 2.3, i.e. because there are more unknowns than
equations; hence the vectors are dependent. In other words, we have proven that any three
vectors in R^2 are dependent.
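For three vectors in R^3, the test of this problem reduces to a single number: the vectors are dependent exactly when the 3 x 3 determinant of their components vanishes. A small sketch (the function name is ours), applied to the vectors of parts (i) and (ii) with the signs used in the echelon computations:

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix whose rows are u, v, w; the rows
    are dependent exactly when it is zero."""
    (a, b, c), (d, e, f), (g, h, i) = u, v, w
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

print(det3((1, 1, 1), (2, -3, -1), (8, -7, -1)))   # 0: dependent, as in (i)
print(det3((1, -2, -3), (2, 3, -1), (3, 2, 1)))    # 30: independent, as in (ii)
```

Note that 30 is exactly the coefficient appearing in the final equation 30z = 0 of part (ii).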
2.10. Suppose in a homogeneous system of linear equations the coefficients of one of the
unknowns are all zero. Show that the system has a nonzero solution.
Suppose x1, ..., xn are the unknowns of the system, and xj is the unknown whose coefficients
are all zero. Then each equation of the system is of the form
        a1x1 + ... + aj-1 xj-1 + 0xj + aj+1 xj+1 + ... + anxn = 0
Then, for example, (0, ..., 0, 1, 0, ..., 0), where 1 is the jth component, is a nonzero solution of each
equation and hence of the system.
MISCELLANEOUS PROBLEMS
2.11. Prove Theorem 2.1: Suppose u is a particular solution of the nonhomogeneous system
(*) and suppose W is the general solution of the associated homogeneous system (**).
Then
        u + W = {u + w : w ∈ W}
is the general solution of the nonhomogeneous system (*).
Let U denote the general solution of the nonhomogeneous system (*), and write
u = (u1, ..., un). Since u is a solution of (*), we have for i = 1, ..., m,
        ai1 u1 + ai2 u2 + ... + ain un = bi
Now suppose w ∈ W and that w = (w1, ..., wn). Since w is a solution of the homogeneous system
(**), we have for i = 1, ..., m,
        ai1 w1 + ai2 w2 + ... + ain wn = 0
Therefore, for i = 1, ..., m,
        ai1(u1 + w1) + ai2(u2 + w2) + ... + ain(un + wn)
              = ai1 u1 + ai1 w1 + ai2 u2 + ai2 w2 + ... + ain un + ain wn
              = (ai1 u1 + ai2 u2 + ... + ain un) + (ai1 w1 + ai2 w2 + ... + ain wn)
              = bi + 0 = bi
That is, u + w is a solution of (*). Thus u + w ∈ U, and hence
        u + W ⊆ U
Now suppose v = (v1, ..., vn) is any arbitrary element of U, i.e. a solution of (*). Then, for
i = 1, ..., m,
        ai1 v1 + ai2 v2 + ... + ain vn = bi
Observe that v = u + (v - u). We claim that v - u ∈ W. For i = 1, ..., m,
        ai1(v1 - u1) + ai2(v2 - u2) + ... + ain(vn - un)
              = (ai1 v1 + ai2 v2 + ... + ain vn) - (ai1 u1 + ai2 u2 + ... + ain un)
              = bi - bi = 0
Thus v - u is a solution of the homogeneous system (**), i.e. v - u ∈ W. Then v ∈ u + W, and hence
        U ⊆ u + W
Both inclusion relations give us U = u + W; that is, u + W is the general solution of the
nonhomogeneous system (*).
2.12. Consider the system (*) of linear equations (page 18). Multiplying the ith equation
by ci, and adding, we obtain the equation
        (c1 a11 + ... + cm am1)x1 + ... + (c1 a1n + ... + cm amn)xn = c1 b1 + ... + cm bm     (1)
Such an equation is termed a linear combination of the equations in (*). Show that
any solution of (*) is also a solution of the linear combination (1).
Suppose u = (k1, ..., kn) is a solution of (*). Then
        ai1 k1 + ai2 k2 + ... + ain kn = bi,    i = 1, ..., m     (2)
To show that u is a solution of (1), we must verify the equation
        (c1 a11 + ... + cm am1)k1 + ... + (c1 a1n + ... + cm amn)kn = c1 b1 + ... + cm bm
But this can be rearranged into
        c1(a11 k1 + ... + a1n kn) + ... + cm(am1 k1 + ... + amn kn) = c1 b1 + ... + cm bm
or, by (2),
        c1 b1 + ... + cm bm = c1 b1 + ... + cm bm
which is clearly a true statement.
2.13. In the system (*) of linear equations, suppose a11 ≠ 0. Let (#) be the system obtained
from (*) by the operation Li -> -ai1 L1 + a11 Li, i ≠ 1. Show that (*) and (#)
are equivalent systems, i.e. have the same solution set.
In view of the above operation on (*), each equation in (#) is a linear combination of equations
in (*); hence by the preceding problem any solution of (*) is also a solution of (#).
On the other hand, applying the operation Li -> (ai1 L1 + Li)/a11 to (#), we obtain the original
system (*). That is, each equation in (*) is a linear combination of equations in (#); hence each
solution of (#) is also a solution of (*).
Both conditions show that (*) and (#) have the same solution set.
2.14. Prove Theorem 2.2: Consider a system in echelon form:
        a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1
                 a2j2 xj2 + a2,j2+1 xj2+1 + ... + a2n xn = b2
                 ...................................................
                          arjr xjr + ar,jr+1 xjr+1 + ... + arn xn = br
where 1 < j2 < ... < jr and where a11 ≠ 0, a2j2 ≠ 0, ..., arjr ≠ 0. The solution is as
follows. There are two cases:
(i) r = n. Then the system has a unique solution.
(ii) r < n. Then we can arbitrarily assign values to the n - r free variables and
obtain a solution of the system.
The proof is by induction on the number r of equations in the system. If r = 1, then we have
the single linear equation
        a1 x1 + a2 x2 + a3 x3 + ... + an xn = b,    where a1 ≠ 0
The free variables are x2, ..., xn. Let us arbitrarily assign values to the free variables; say,
x2 = k2, x3 = k3, ..., xn = kn. Substituting into the equation and solving for x1,
        x1 = (1/a1)(b - a2 k2 - a3 k3 - ... - an kn)
These values constitute a solution of the equation; for, on substituting, we obtain
        a1 · (1/a1)(b - a2 k2 - ... - an kn) + a2 k2 + ... + an kn = b    or    b = b
which is a true statement.
Furthermore, if r = n = 1, then we have ax = b, where a ≠ 0. Note that x = b/a is a solution
since a(b/a) = b is true. Moreover, if x = k is a solution, i.e. ak = b, then k = b/a. Thus
the equation has a unique solution as claimed.
Now assume r > 1 and that the theorem is true for a system of r - 1 equations. We view the
r - 1 equations
        a2j2 xj2 + a2,j2+1 xj2+1 + ... + a2n xn = b2
        ...................................................
        arjr xjr + ... + arn xn = br
as a system in the unknowns xj2, ..., xn. Note that the system is in echelon form. By induction
we can arbitrarily assign values to the (n - j2 + 1) - (r - 1) free variables in the reduced system
to obtain a solution (say, xj2 = kj2, ..., xn = kn). As in case r = 1, these values and arbitrary
values for the additional j2 - 2 free variables (say, x2 = k2, ..., xj2-1 = kj2-1) yield a solution
of the first equation with
        x1 = (1/a11)(b1 - a12 k2 - ... - a1n kn)
(Note that there are (n - j2 + 1) - (r - 1) + (j2 - 2) = n - r free variables.) Furthermore, these
values for x1, ..., xn also satisfy the other equations since, in these equations, the coefficients of
x1, ..., xj2-1 are zero.
Now if r = n, then j2 = 2. Thus by induction we obtain a unique solution of the subsystem
and then a unique solution of the entire system. Accordingly, the theorem is proven.
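For the r = n case, the induction step amounts to ordinary back-substitution: solve the last equation, then each earlier one in turn. A minimal sketch under that assumption (the function name and the sample numbers are ours):

```python
from fractions import Fraction

def back_substitute(a, b):
    """Solve a square upper-triangular (echelon, r = n) system a x = b
    by the substitution used in the induction step."""
    n = len(b)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(b[i]) - s) / a[i][i]
    return x

# x + 2y - 3z = 6 ;  2y - 5z = 4 ;  2z = 2   (hypothetical numbers)
a = [[Fraction(1), Fraction(2), Fraction(-3)],
     [Fraction(0), Fraction(2), Fraction(-5)],
     [Fraction(0), Fraction(0), Fraction(2)]]
print(back_substitute(a, [6, 4, 2]))   # x = 0, y = 9/2, z = 1
```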
2.15. A system (*) of linear equations is defined to be consistent if no linear combination
of its equations is the equation
        0x1 + 0x2 + ... + 0xn = b,    where b ≠ 0     (1)
Show that the system (*) is consistent if and only if it is reducible to echelon form.
Suppose (*) is reducible to echelon form. Then it has a solution which, by Problem 2.12, is a
solution of every linear combination of its equations. Since (1) has no solution, it cannot be a linear
combination of the equations in (*). That is, (*) is consistent.
On the other hand, suppose (*) is not reducible to echelon form. Then, in the reduction process,
it must yield an equation of the form (1). That is, (1) is a linear combination of the equations in (*).
Accordingly (*) is not consistent, i.e. (*) is inconsistent.
Supplementary Problems
SOLUTION OF LINEAR EQUATIONS
2.16. Solve:
        (i)  2x + 3y = 1        (ii)  2x + 4y = 10        (iii)   4x - 2y = 5
             5x + 7y = 3              3x + 6y = 15               -6x + 3y = 1
2.17. Solve:
        (i)  2x + y - 3z = 5        (ii)  2x + 3y - 2z = 5        (iii)   x + 2y + 3z = 3
             3x - 2y + 2z = 5              x - 2y + 3z = 2               2x + 3y + 8z = 4
             5x - 3y - z = 16             4x - y + 4z = 1               3x + 2y + 17z = 1
2.18. Solve:
        (i)  2x + 3y = 3        (ii)   x + 2y - 3z + 2w = 2        (iii)   x + 2y - z + 3w = 3
              x - 2y = 5              2x + 5y - 8z + 6w = 5               2x + 4y + 4z + 3w = 9
             3x + 2y = 7              3x + 4y - 5z + 2w = 4               3x + 6y - z + 8w = 10
2.19. Solve:
        (i)   x + 2y + 2z = 2        (ii)   x + 5y + 4z - 13w = 3
             3x - 2y - z = 5               3x - y + 2z + 5w = 2
             2x - 5y + 3z = -4             2x + 2y + 3z - 4w = 1
              x + 4y + 6z = 0
2.20. Determine the values of k such that the system in unknowns x, y and z has: (i) a unique solution,
(ii) no solution, (iii) more than one solution:
        (a)  kx + y + z = 1        (b)   x + 2y + kz = 1
              x + ky + z = 1             2x + ky + 8z = 3
              x + y + kz = 1
2.21. Determine the values of k such that the system in unknowns x, y and z has: (i) a unique solution,
(ii) no solution, (iii) more than one solution:
        (a)   x + y + kz = 2        (b)   x - 3z = -3
             3x + 4y + 2z = k             2x + ky - z = -2
             2x + 3y - z = 1               x + 2y + kz = 1
2.22. Determine the condition on a, b and c so that the system in unknowns x, y and z has a solution:
        (i)   x + 2y - 3z = a        (ii)   x - 2y + 4z = a
             3x - y + 2z = b               2x + 3y - z = b
              x - 5y + 8z = c              3x + y + 2z = c
HOMOGENEOUS SYSTEMS
2.23. Determine whether each system has a nonzero solution:
        (i)   x + 3y - 2z = 0        (ii)   x + 3y - 2z = 0        (iii)   x + 2y - 5z + 4w = 0
              x - 8y + 8z = 0              2x - 3y + z = 0                2x - 3y + 2z + 3w = 0
             3x - 2y + 4z = 0              3x - 2y + 2z = 0               4x - 7y + z - 6w = 0
2.24. Determine whether each system has a nonzero solution:
        (i)   x - 2y + 2z = 0         (ii)  2x - 4y + 7z + 4v - 5w = 0
             2x + y - 2z = 0                9x + 3y + 2z - 7v + w = 0
             3x + 4y - 6z = 0               5x + 2y - 3z + v + 3w = 0
             3x - 11y + 12z = 0             6x - 5y + 4z - 3v - 2w = 0
2.25. Determine whether the vectors u, v and w are dependent or independent (see Problem 2.9) where:
        (i) u = (1, 3, -1), v = (2, 0, 1), w = (1, -1, 1)
        (ii) u = (1, 1, 1), v = (2, 1, 0), w = (1, 1, 2)
        (iii) u = (1, -2, 3, 1), v = (3, 2, 1, -2), w = (1, 6, -5, -4)
MISCELLANEOUS PROBLEMS
2.26. Consider two general linear equations in two unknowns x and y over the real field R:
        ax + by = e
        cx + dy = f
Show that:
(i) if a/c ≠ b/d, i.e. if ad - bc ≠ 0, then the system has the unique solution
        x = (de - bf)/(ad - bc),    y = (af - ce)/(ad - bc);
(ii) if a/c = b/d ≠ e/f, then the system has no solution;
(iii) if a/c = b/d = e/f, then the system has more than one solution.
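The formulas of part (i) can be checked directly on a concrete system; a small sketch (the function name is ours, and it assumes ad - bc ≠ 0):

```python
def solve2(a, b, e, c, d, f):
    """Solve ax + by = e, cx + dy = f by the formulas of part (i),
    assuming the determinant ad - bc is nonzero."""
    det = a * d - b * c
    return ((d * e - b * f) / det, (a * f - c * e) / det)

# The system 2x + 3y = 1, 5x + 7y = 3 of Problem 2.16(i):
print(solve2(2, 3, 1, 5, 7, 3))   # (2.0, -1.0)
```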
2.27. Consider the system
        ax + by = 1
        cx + dy = 0
Show that if ad - bc ≠ 0, then the system has the unique solution x = d/(ad - bc),
y = -c/(ad - bc). Also show that if ad - bc = 0 and c ≠ 0 or d ≠ 0, then the system has no solution.
2.28. Show that an equation of the form 0x1 + 0x2 + ... + 0xn = 0 may be added to or deleted from a
system without affecting the solution set.
2.29. Consider a system of linear equations with the same number of equations as unknowns:
        a11 x1 + a12 x2 + ... + a1n xn = b1
        a21 x1 + a22 x2 + ... + a2n xn = b2
        ..................................     (1)
        an1 x1 + an2 x2 + ... + ann xn = bn
(i) Suppose the associated homogeneous system has only the zero solution. Show that (1) has a
unique solution for every choice of constants bi.
(ii) Suppose the associated homogeneous system has a nonzero solution. Show that there are
constants bi for which (1) does not have a solution. Also show that if (1) has a solution, then
it has more than one.
Answers to Supplementary Problems
2.16. (i) x = 2, y = -1; (ii) x = 5 - 2a, y = a; (iii) no solution
2.17. (i) (1, -3, -2); (ii) no solution; (iii) (-1 - 7a, 2 + 2a, a) or x = -1 - 7z, y = 2 + 2z
2.18. (i) x = 3, y = -1
      (ii) (-a + 2b, 1 + 2a - 2b, a, b) or x = -z + 2w, y = 1 + 2z - 2w
      (iii) (7/2 - 5b/2 - 2a, a, 1/2 + b/2, b) or x = 7/2 - 5w/2 - 2y, z = 1/2 + w/2
2.19. (i) (2, 1, -1); (ii) no solution
2.20. (a) (i) k ≠ 1 and k ≠ -2; (ii) k = -2; (iii) k = 1
      (b) (i) never has a unique solution; (ii) k = 4; (iii) k ≠ 4
2.21. (a) (i) k ≠ 3; (ii) never (the system always has a solution); (iii) k = 3
      (b) (i) k ≠ 2 and k ≠ -5; (ii) k = -5; (iii) k = 2
2.22. (i) 2a - b + c = 0. (ii) Any values of a, b and c yield a solution.
2.23. (i) yes; (ii) no; (iii) yes, by Theorem 2.3
2.24. (i) yes; (ii) yes, by Theorem 2.3
2.25. (i) dependent; (ii) independent; (iii) dependent
chapter 3
Matrices
INTRODUCTION
In working with a system of linear equations, only the coefficients and their respective
positions are important. Also, in reducing the system to echelon form, it is essential to
keep the equations carefully aligned. Thus these coefficients can be efficiently arranged in
a rectangular array called a "matrix". Moreover, certain abstract objects introduced in
later chapters, such as "change of basis", "linear operator" and "bilinear form", can also
be represented by these rectangular arrays, i.e. matrices.
In this chapter, we will study these matrices and certain algebraic operations defined on
them. The material introduced here is mainly computational. However, as with linear
equations, the abstract treatment presented later on will give us new insight into the
structure of these matrices.
Unless otherwise stated, all the "entries" in our matrices shall come from some arbitrary,
but fixed, field K. (See Appendix B.) The elements of K are called scalars. Nothing essential is lost if the reader assumes that K is the real field R or the complex field C.
Lastly, we remark that the elements of R" or C" are conveniently represented by "row
vectors" or "column vectors", which are special cases of matrices.
MATRICES
Let K be an arbitrary field. A rectangular array of the form

        ( a11  a12  ...  a1n )
        ( a21  a22  ...  a2n )
        ( .................. )
        ( am1  am2  ...  amn )

where the aij are scalars in K, is called a matrix over K, or simply a matrix if K is implicit.
The above matrix is also denoted by (aij), i = 1, ..., m, j = 1, ..., n, or simply by (aij).
The m horizontal n-tuples
        (a11, a12, ..., a1n),  (a21, a22, ..., a2n),  ...,  (am1, am2, ..., amn)
are the rows of the matrix, and the n vertical m-tuples
        ( a11 )   ( a12 )          ( a1n )
        ( a21 ) , ( a22 ) , ... ,  ( a2n )
        ( ... )   ( ... )          ( ... )
        ( am1 )   ( am2 )          ( amn )
are its columns. Note that the element aij, called the ij-entry or ij-component, appears in
the ith row and the jth column. A matrix with m rows and n columns is called an m by n
matrix, or m x n matrix; the pair of numbers (m, n) is called its size or shape.
36 MATRICES [CHAP. 3
Example 3.1: The following is a 2 x 3 matrix:
        ( 1  -3   4 )
        ( 0   5  -2 )
Its rows are (1, -3, 4) and (0, 5, -2); its columns are
        ( 1 )   ( -3 )   (  4 )
        ( 0 ) , (  5 ) , ( -2 )
Matrices will usually be denoted by capital letters A,B, . . ., and the elements of the
field K by lower case letters a,b, . . . . Two matrices A and B are equal, written A = B, if
they have the same shape and if corresponding elements are equal. Thus the equality of
two mxn matrices is equivalent to a system of mn equalities, one for each pair of elements.
Example 3.2: The statement
        ( x + y   2z + w )     ( 3  5 )
        ( x - y    z - w )  =  ( 1  4 )
is equivalent to the following system of equations:
        x + y = 3
        x - y = 1
        2z + w = 5
        z - w = 4
The solution of the system is x = 2, y = 1, z = 3, w = -1.
Remark: A matrix with one row is also referred to as a row vector, and with one column
as a column vector. In particular, an element in the field K can be viewed as
a 1 X 1 matrix.
MATRIX ADDITION AND SCALAR MULTIPLICATION
Let A and B be two matrices with the same size, i.e. the same number of rows and of
columns, say, mxn matrices:
        ( a11  a12  ...  a1n )            ( b11  b12  ...  b1n )
    A = ( a21  a22  ...  a2n )  ,     B = ( b21  b22  ...  b2n )
        ( .................. )            ( .................. )
        ( am1  am2  ...  amn )            ( bm1  bm2  ...  bmn )

The sum of A and B, written A + B, is the matrix obtained by adding corresponding entries:

            ( a11 + b11   a12 + b12   ...   a1n + b1n )
    A + B = ( a21 + b21   a22 + b22   ...   a2n + b2n )
            ( ....................................... )
            ( am1 + bm1   am2 + bm2   ...   amn + bmn )

The product of a scalar k by the matrix A, written k · A or simply kA, is the matrix obtained
by multiplying each entry of A by k:

         ( ka11  ka12  ...  ka1n )
    kA = ( ka21  ka22  ...  ka2n )
         ( ..................... )
         ( kam1  kam2  ...  kamn )
Observe that A+B and kA are also mxn matrices. We also define
        -A = -1 · A    and    A - B = A + (-B)
The sum of matrices with different sizes is not defined.
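These entrywise definitions translate directly into code; a minimal sketch (the function names are ours), tried on two sample 2 x 3 matrices:

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    assert len(A) == len(B) and len(A[0]) == len(B[0])  # sizes must agree
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(k, A):
    """Multiply every entry of A by the scalar k."""
    return [[k * x for x in row] for row in A]

A = [[1, -2, 3], [4, 5, -6]]
B = [[3, 0, 2], [-7, 1, 8]]
print(mat_add(A, B))                                 # [[4, -2, 5], [-3, 6, 2]]
print(mat_add(scalar_mul(2, A), scalar_mul(-3, B)))  # 2A - 3B
```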
Example 3.3: Let  A = ( 1  -2   3 )   and   B = (  3   0   2 ) .  Then
                      ( 4   5  -6 )             ( -7   1   8 )

        A + B = ( 1 + 3   -2 + 0    3 + 2 )  =  (  4  -2   5 )
                ( 4 - 7    5 + 1   -6 + 8 )     ( -3   6   2 )

        3A = ( 3·1   3·(-2)   3·3    )  =  (  3   -6    9 )
             ( 3·4   3·5      3·(-6) )     ( 12   15  -18 )

        2A - 3B = ( 2  -4    6 )  +  (  -9   0   -6 )  =  ( -7  -4    0 )
                  ( 8  10  -12 )     (  21  -3  -24 )     ( 29   7  -36 )
Example 3.4: The m x n matrix whose entries are all zero,
        ( 0  0  ...  0 )
        ( 0  0  ...  0 )
        ( ............ )
        ( 0  0  ...  0 )
is called the zero matrix and will be denoted by 0. It is similar to the scalar 0 in
that, for any m x n matrix A = (aij), A + 0 = (aij + 0) = (aij) = A.
Basic properties of matrices under the operations of matrix addition and scalar multiplication follow.
Theorem 3.1: Let V be the set of all m x n matrices over a field K. Then for any matrices
A, B, C ∈ V and any scalars k1, k2 ∈ K,
        (i) (A + B) + C = A + (B + C)        (v) k1(A + B) = k1A + k1B
        (ii) A + 0 = A                       (vi) (k1 + k2)A = k1A + k2A
        (iii) A + (-A) = 0                   (vii) (k1k2)A = k1(k2A)
        (iv) A + B = B + A                   (viii) 1 · A = A and 0A = 0
Using (vi) and (viii) above, we also have that A + A = 2A, A + A + A = 3A, ... .
Remark: Suppose vectors in R^n are represented by row vectors (or by column vectors);
say,
        u = (a1, a2, ..., an)    and    v = (b1, b2, ..., bn)
Then viewed as matrices, the sum u + v and the scalar product ku are as follows:
        u + v = (a1 + b1, a2 + b2, ..., an + bn)    and    ku = (ka1, ka2, ..., kan)
But this corresponds precisely to the sum and scalar product as defined in
Chapter 1. In other words, the above operations on matrices may be viewed
as a generalization of the corresponding operations defined in Chapter 1.
MATRIX MULTIPLICATION
The product of matrices A and B, written AB, is somewhat complicated. For this
reason, we include the following introductory remarks.
(i) Let A = (ai) and B = (bi) belong to R^n, with A represented by a row vector and B by a
column vector. Then their dot product A · B may be found by combining the matrices
as follows:
                                   ( b1 )
        A · B = (a1, a2, ..., an)  ( b2 )  =  a1 b1 + a2 b2 + ... + an bn
                                   ( .. )
                                   ( bn )
Accordingly, we define the matrix product of a row vector A by a column vector B as
above.
(ii) Consider the equations
        b11 x1 + b12 x2 + b13 x3 = y1
        b21 x1 + b22 x2 + b23 x3 = y2     (1)
This system is equivalent to the matrix equation
        ( b11  b12  b13 ) ( x1 )     ( y1 )
        ( b21  b22  b23 ) ( x2 )  =  ( y2 )      or simply  BX = Y
                          ( x3 )
where B = (bij), X = (xi) and Y = (yi), if we combine the matrix B and the column
vector X as follows:
        BX = ( b11  b12  b13 ) ( x1 )  =  ( b11 x1 + b12 x2 + b13 x3 )  =  ( B1 · X )
             ( b21  b22  b23 ) ( x2 )     ( b21 x1 + b22 x2 + b23 x3 )     ( B2 · X )
                               ( x3 )
where B1 and B2 are the rows of B. Note that the product of a matrix and a column
vector yields another column vector.
(iii) Now consider the equations
        a11 y1 + a12 y2 = z1
        a21 y1 + a22 y2 = z2     (2)
which we can represent, as above, by the matrix equation
        ( a11  a12 ) ( y1 )     ( z1 )
        ( a21  a22 ) ( y2 )  =  ( z2 )      or simply  AY = Z
where A = (aij), Y = (yi) as above, and Z = (zi). Substituting the values of y1 and y2
of (1) into the equations of (2), we obtain
        a11(b11 x1 + b12 x2 + b13 x3) + a12(b21 x1 + b22 x2 + b23 x3) = z1
        a21(b11 x1 + b12 x2 + b13 x3) + a22(b21 x1 + b22 x2 + b23 x3) = z2
or, on rearranging terms,
        (a11 b11 + a12 b21)x1 + (a11 b12 + a12 b22)x2 + (a11 b13 + a12 b23)x3 = z1     (3)
        (a21 b11 + a22 b21)x1 + (a21 b12 + a22 b22)x2 + (a21 b13 + a22 b23)x3 = z2
On the other hand, using the matrix equation BX = Y and substituting for Y into
AY = Z, we obtain the expression
        ABX = Z
This will represent the system (3) if we define the product of A and B as follows:
        ( a11  a12 ) ( b11  b12  b13 )   =   ( a11 b11 + a12 b21   a11 b12 + a12 b22   a11 b13 + a12 b23 )
        ( a21  a22 ) ( b21  b22  b23 )       ( a21 b11 + a22 b21   a21 b12 + a22 b22   a21 b13 + a22 b23 )

                                         =   ( A1 · B^1   A1 · B^2   A1 · B^3 )
                                             ( A2 · B^1   A2 · B^2   A2 · B^3 )
where A1 and A2 are the rows of A and B^1, B^2 and B^3 are the columns of B. We emphasize
that if these computations are done in general, then the main requirement is
that the number of yi in (1) and (2) must be the same. This will then correspond to the
fact that the number of columns of the matrix A must equal the number of rows of
the matrix B.
With the above introduction, we now formally define matrix multiplication.
Definition: Suppose A = (aik) and B = (bkj) are matrices such that the number of columns
of A is equal to the number of rows of B; say, A is an m x p matrix and B is a
p x n matrix. Then the product AB is the m x n matrix whose ij-entry is
obtained by multiplying the ith row Ai of A by the jth column B^j of B:
That is,
        ( a11  ...  a1p ) ( b11  ...  b1n )     ( A1 · B^1   ...   A1 · B^n )     ( c11  ...  c1n )
        ( .............. ) ( .............. )  =  ( ......................... )  =  ( .............. )
        ( am1  ...  amp ) ( bp1  ...  bpn )     ( Am · B^1   ...   Am · B^n )     ( cm1  ...  cmn )

where cij = ai1 b1j + ai2 b2j + ... + aip bpj = Σ (k = 1 to p) aik bkj.
We emphasize that the product AB is not defined if A is an m x p matrix and B is a
q x n matrix, where p ≠ q.
Example 3.5:
        ( r  s ) ( a1  a2  a3 )   =   ( ra1 + sb1   ra2 + sb2   ra3 + sb3 )
        ( t  u ) ( b1  b2  b3 )       ( ta1 + ub1   ta2 + ub2   ta3 + ub3 )
Example 3.6:
        ( 1  2 ) ( 1  1 )   =   ( 1·1 + 2·0   1·1 + 2·2 )   =   ( 1   5 )
        ( 3  4 ) ( 0  2 )       ( 3·1 + 4·0   3·1 + 4·2 )       ( 3  11 )

        ( 1  1 ) ( 1  2 )   =   ( 1 + 3   2 + 4 )   =   ( 4  6 )
        ( 0  2 ) ( 3  4 )       ( 0 + 6   0 + 8 )       ( 6  8 )
The above example shows that matrix multiplication is not commutative, i.e. the products
AB and BA of matrices need not be equal.
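The formula cij = Σk aik bkj is a triple loop in code. A minimal sketch (the function name is ours), tried on the two 2 x 2 products just computed:

```python
def mat_mul(A, B):
    """Product AB: the ij-entry is the dot product of row i of A with
    column j of B; the number of columns of A must equal the rows of B."""
    p = len(B)
    assert len(A[0]) == p
    return [[sum(A[i][k] * B[k][j] for k in range(p))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[1, 1], [0, 2]]
print(mat_mul(A, B))   # [[1, 5], [3, 11]]
print(mat_mul(B, A))   # [[4, 6], [6, 8]] -- AB and BA differ
```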
Matrix multiplication does, however, satisfy the following properties:
Theorem 3.2: (i) iAB)C = A{BC), (associative law)
(ii) A{B + C) = AB + AC, (left distributive law)
(iii) (B + C)A = BA + CA, (right distributive law)
(iv) k(AB) = (kA)B = A(kB), where k is a scalar
We assume that the sums and products in the above theorem are defined.
We remark that 0A = 0 and B0 = 0 where 0 is the zero matrix.
TRANSPOSE
The transpose of a matrix A, written A^t, is the matrix obtained by writing the rows of
A, in order, as columns:

        ( a11  a12  ...  a1n ) t     ( a11  a21  ...  am1 )
        ( a21  a22  ...  a2n )    =  ( a12  a22  ...  am2 )
        ( .................. )       ( .................. )
        ( am1  am2  ...  amn )       ( a1n  a2n  ...  amn )

Observe that if A is an m x n matrix, then A^t is an n x m matrix.
Example 3.7:    ( 1  2   3 ) t     ( 1   4 )
                ( 4  5  -6 )    =  ( 2   5 )
                                   ( 3  -6 )
The transpose operation on matrices satisfies the following properties:
Theorem 3.3: (i) (A + B)^t = A^t + B^t
        (ii) (A^t)^t = A
        (iii) (kA)^t = kA^t, for k a scalar
        (iv) (AB)^t = B^t A^t
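Transposition simply swaps the roles of rows and columns, and a property such as (iv) can be spot-checked numerically. A minimal sketch (the function names are ours):

```python
def transpose(A):
    """Rows of A become the columns of A^t."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    # helper used only to check the identity (AB)^t = B^t A^t
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 3], [4, 5, -6]]
B = [[1, 0], [2, 1], [0, 3]]
print(transpose(A))   # [[1, 4], [2, 5], [3, -6]]
# Theorem 3.3(iv):
print(transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A)))  # True
```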
MATRICES AND SYSTEMS OF LINEAR EQUATIONS
The following system of linear equations
        a11 x1 + a12 x2 + ... + a1n xn = b1
        a21 x1 + a22 x2 + ... + a2n xn = b2
        ..................................     (1)
        am1 x1 + am2 x2 + ... + amn xn = bm
is equivalent to the matrix equation
        ( a11  a12  ...  a1n ) ( x1 )     ( b1 )
        ( a21  a22  ...  a2n ) ( x2 )  =  ( b2 )      or simply  AX = B     (2)
        ( .................. ) ( .. )     ( .. )
        ( am1  am2  ...  amn ) ( xn )     ( bm )
where A = (aij), X = (xi) and B = (bi). That is, every solution of the system (1) is a
solution of the matrix equation (2), and vice versa. Observe that the associated homogeneous
system of (1) is then equivalent to the matrix equation AX = 0.
The above matrix A is called the coefficient matrix of the system (1), and the matrix
        ( a11  a12  ...  a1n  b1 )
        ( a21  a22  ...  a2n  b2 )
        ( ...................... )
        ( am1  am2  ...  amn  bm )
is called the augmented matrix of (1). Observe that the system (1) is completely determined
by its augmented matrix.
Example 3.8: The coefficient matrix and the augmented matrix of the system
        2x + 3y - 4z = 7
         x - 2y - 5z = 3
are respectively the following matrices:
        ( 2   3  -4 )      and      ( 2   3  -4   7 )
        ( 1  -2  -5 )               ( 1  -2  -5   3 )
Observe that the system is equivalent to the matrix equation
        ( 2   3  -4 ) ( x )     ( 7 )
        ( 1  -2  -5 ) ( y )  =  ( 3 )
                      ( z )
In studying linear equations it is usually simpler to use the language and theory of
matrices, as indicated by the following theorems.
Theorem 3.4: Suppose u1, u2, ..., un are solutions of a homogeneous system of linear
equations AX = 0. Then every linear combination of the ui of the form
k1 u1 + k2 u2 + ... + kn un, where the ki are scalars, is also a solution of
AX = 0. Thus, in particular, every multiple ku of any solution u of
AX = 0 is also a solution of AX = 0.
Proof. We are given that Au1 = 0, Au2 = 0, ..., Aun = 0. Hence
        A(k1 u1 + k2 u2 + ... + kn un) = k1 Au1 + k2 Au2 + ... + kn Aun
                                       = k1 0 + k2 0 + ... + kn 0 = 0
Accordingly, k1 u1 + ... + kn un is a solution of the homogeneous system AX = 0.
Theorem 3.5: Suppose the field K is infinite (e.g. if K is the real field R or the complex
field C). Then the system AX = B has no solution, a unique solution or
an infinite number of solutions.
Proof. It suffices to show that if AX = B has more than one solution, then it has
infinitely many. Suppose u and v are distinct solutions of AX = B; that is, Au = B and
Av = B. Then, for any k ∈ K,
        A(u + k(u - v)) = Au + k(Au - Av) = B + k(B - B) = B
In other words, for each k ∈ K, u + k(u - v) is a solution of AX = B. Since all such
solutions are distinct (Problem 3.31), AX = B has an infinite number of solutions as
claimed.
ECHELON MATRICES
A matrix A = (aij) is an echelon matrix, or is said to be in echelon form, if the number
of zeros preceding the first nonzero entry of a row increases row by row until only zero
rows remain; that is, if there exist nonzero entries
        a1j1, a2j2, ..., arjr,    where j1 < j2 < ... < jr
with the property that
        aij = 0    for i ≤ r, j < ji,    and for i > r
We call a1j1, ..., arjr the distinguished elements of the echelon matrix A.
Example 3.9: The following are echelon matrices; in each, the distinguished elements are the
first nonzero entries of the nonzero rows:

        ( 1  3  2  0  4  5 -6 )     ( 1  2  3 )     ( 0  1  3  0  0  4 )
        ( 0  0  7  1 -3  2  0 )     ( 0  0  1 )     ( 0  0  0  1  0 -3 )
        ( 0  0  0  0  0  6  2 )     ( 0  0  0 )     ( 0  0  0  0  1  2 )
        ( 0  0  0  0  0  0  0 )
In particular, an echelon matrix is called a row reduced echelon matrix if the distinguished elements are:
        (i) the only nonzero entries in their respective columns;
        (ii) each equal to 1.
The third matrix above is an example of a row reduced echelon matrix; the other two are not.
not. Note that the zero matrix 0, for any number of rows or of columns, is also a row
reduced echelon matrix.
ROW EQUIVALENCE AND ELEMENTARY ROW OPERATIONS
A matrix A is said to be row equivalent to a matrix B if B can be obtained from A by a
finite sequence of the following operations called elementary row operations:
[E1]: Interchange the ith row and the jth row: Ri <-> Rj.
[E2]: Multiply the ith row by a nonzero scalar k: Ri -> kRi, k ≠ 0.
[E3]: Replace the ith row by k times the jth row plus the ith row: Ri -> kRj + Ri.
In actual practice we apply [E2] and then [E3] in one step, i.e. the operation
[E]: Replace the ith row by k' times the jth row plus k (nonzero) times the ith row:
        Ri -> k'Rj + kRi, k ≠ 0.
The reader no doubt recognizes the similarity of the above operations and those used
in solving systems of linear equations. In fact, two systems with row equivalent augmented matrices have the same solution set (Problem 3.71). The following algorithm is
also similar to the one used with linear equations (page 20).
Algorithm which row reduces a matrix to echelon form:
Step 1. Suppose the j1 column is the first column with a nonzero entry. Interchange
the rows so that this nonzero entry appears in the first row, that is,
so that a1j1 ≠ 0.
Step 2. For each i > 1, apply the operation
        Ri -> -aij1 R1 + a1j1 Ri
Repeat Steps 1 and 2 with the submatrix formed by all the rows excluding the first.
Continue the process until the matrix is in echelon form.
Remark: The term row reduce shall mean to transform by elementary row operations.
Example 3.10: The following matrix A is row reduced to echelon form by applying the operations
R2 -> -2R1 + R2 and R3 -> -3R1 + R3, and then the operation R3 -> -5R2 + 4R3:

        A = ( 1  2  -3  0 )  to  ( 1  2  -3  0 )  to  ( 1  2  -3  0 )
            ( 2  4  -2  2 )      ( 0  0   4  2 )      ( 0  0   4  2 )
            ( 3  6  -4  3 )      ( 0  0   5  3 )      ( 0  0   0  2 )
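The algorithm above, with Step 2's fraction-free operation Ri -> -aij1 R1 + a1j1 Ri, can be sketched as follows (the function name is ours), applied to the 3 x 4 matrix just reduced:

```python
def row_reduce_echelon(A):
    """Row reduce an integer matrix to echelon form using the
    integer-preserving operation R_i -> -a_ic * R_r + a_rc * R_i."""
    M = [row[:] for row in A]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        M[r], M[piv] = M[piv], M[r]       # bring pivot row to the top
        for i in range(r + 1, len(M)):
            M[i] = [-M[i][c] * a + M[r][c] * b for a, b in zip(M[r], M[i])]
        r += 1
    return M

A = [[1, 2, -3, 0], [2, 4, -2, 2], [3, 6, -4, 3]]
print(row_reduce_echelon(A))   # [[1, 2, -3, 0], [0, 0, 4, 2], [0, 0, 0, 2]]
```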
Now suppose A = (aij) is a matrix in echelon form with distinguished elements
a1j1, ..., arjr. Apply the operations
        Rk -> -akji Ri + aiji Rk,    k = 1, ..., i - 1
for i = 2, then i = 3, ..., i = r. Thus A is replaced by an echelon matrix whose distinguished
elements are the only nonzero entries in their respective columns. Next, multiply
Ri by 1/aiji, i = 1, ..., r. Thus, in addition, the distinguished elements are each 1. In other words,
the above process row reduces an echelon matrix to one in row reduced echelon form.
Example 3.11: On the following echelon matrix A, apply the operation R1 -> -4R2 + 3R1, and then
the operations R1 -> R3 + R1 and R2 -> -5R3 + 2R2:

        A = ( 2  3  4  5  6 )  to  ( 6  9  0  7  -2 )  to  ( 6  9  0  7  0 )
            ( 0  0  3  2  5 )      ( 0  0  3  2   5 )      ( 0  0  6  4  0 )
            ( 0  0  0  0  2 )      ( 0  0  0  0   2 )      ( 0  0  0  0  2 )

Next multiply R1 by 1/6, R2 by 1/6 and R3 by 1/2 to obtain the row reduced echelon
matrix

        ( 1  3/2  0  7/6  0 )
        ( 0   0   1  2/3  0 )
        ( 0   0   0   0   1 )
The above remarks show that any arbitrary matrix A is row equivalent to at least one
row reduced echelon matrix. In the next chapter we prove (Theorem 4.8) that A is row
equivalent to only one such matrix; we call it the row canonical form of A.
SQUARE MATRICES
A matrix with the same number of rows as columns is called a square matrix. A square
matrix with n rows and n columns is said to be of order n, and is called an n-square matrix.
The diagonal (or: main diagonal) of the n-square matrix A = (aij) consists of the elements
a11, a22, ..., ann.
Example 3.12: The following is a 3-square matrix:
        ( 1  2  3 )
        ( 4  5  6 )
        ( 7  8  9 )
Its diagonal elements are 1, 5 and 9.
An upper triangular matrix or simply a triangular matrix is a square matrix whose
entries below the main diagonal are all zero:

        ( a11  a12  ...  a1n )           ( a11  a12  ...  a1n )
        (  0   a22  ...  a2n )     or    (      a22  ...  a2n )
        ( .................. )           ( .................. )
        (  0    0   ...  ann )           (            ... ann )

Similarly, a lower triangular matrix is a square matrix whose entries above the main
diagonal are all zero.
A diagonal matrix is a square matrix whose nondiagonal entries are all zero:

        ( a1   0  ...   0 )           ( a1            )
        (  0  a2  ...   0 )     or    (     a2        )
        ( ............... )           (        ...    )
        (  0   0  ...  an )           (            an )

In particular, the n-square matrix with 1's on the diagonal and 0's elsewhere, denoted by In
or simply I, is called the unit or identity matrix; e.g.,

        I3 = ( 1  0  0 )
             ( 0  1  0 )
             ( 0  0  1 )
This matrix I is similar to the scalar 1 in that, for any n-square matrix A,
        AI = IA = A
The matrix kI, for a scalar k ∈ K, is called a scalar matrix; it is a diagonal matrix whose
diagonal entries are each k.
ALGEBRA OF SQUARE MATRICES
Recall that not every two matrices can be added or multiplied. However, if we only
consider square matrices of some given order n, then this inconvenience disappears. Specifically, the operations of addition, multiplication, scalar multiplication, and transpose can be
performed on any nxn matrices and the result is again an n x n matrix.
In particular, if A is any n-square matrix, we can form powers of A:
        A^2 = AA,  A^3 = A^2 A,  ...  and  A^0 = I
We can also form polynomials in the matrix A: for any polynomial
        f(x) = a0 + a1 x + a2 x^2 + ... + an x^n
where the ai are scalars, we define f(A) to be the matrix
        f(A) = a0 I + a1 A + a2 A^2 + ... + an A^n
In the case that f(A) is the zero matrix, then A is called a zero or root of the polynomial f(x).
Example 3.13: Let A = ( 1   2 ) ; then  A^2 = ( 1   2 )( 1   2 )  =  (  7  -6 )
                      ( 3  -4 )               ( 3  -4 )( 3  -4 )     ( -9  22 )
        If f(x) = 2x^2 - 3x + 5, then
                f(A) = 2(  7  -6 ) - 3( 1   2 ) + 5( 1  0 )  =  (  16  -18 )
                        ( -9  22 )    ( 3  -4 )    ( 0  1 )     ( -27   61 )
        If g(x) = x^2 + 3x - 10, then
                g(A) = (  7  -6 ) + 3( 1   2 ) - 10( 1  0 )  =  ( 0  0 )
                       ( -9  22 )    ( 3  -4 )     ( 0  1 )     ( 0  0 )
        Thus A is a zero of the polynomial g(x).
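Evaluating f(A) is a loop over the coefficients, accumulating scalar multiples of successive powers of A. A minimal sketch (the function names are ours; coefficients are listed from a0 upward):

```python
def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def poly_of_matrix(coeffs, A):
    """Evaluate f(A) = a0*I + a1*A + ... + an*A^n for a square matrix A."""
    n = len(A)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    result = [[0] * n for _ in range(n)]
    for a in coeffs:
        result = [[r + a * p for r, p in zip(rr, pr)]
                  for rr, pr in zip(result, power)]
        power = mat_mul(power, A)   # next power of A
    return result

A = [[1, 2], [3, -4]]
print(poly_of_matrix([5, -3, 2], A))    # f(x) = 2x^2 - 3x + 5
print(poly_of_matrix([-10, 3, 1], A))   # g(x) = x^2 + 3x - 10: zero matrix
```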
INVERTIBLE MATRICES
A square matrix A is said to be invertible if there exists a matrix B with the property
that
        AB = BA = I
where I is the identity matrix. Such a matrix B is unique; for
        AB1 = B1A = I and AB2 = B2A = I  implies  B1 = B1 I = B1(AB2) = (B1A)B2 = I B2 = B2
We call such a matrix B the inverse of A and denote it by A^-1. Observe that the above
relation is symmetric; that is, if B is the inverse of A, then A is the inverse of B.
Example 3.14:
        ( 2  5 )(  3  -5 )  =  ( 6 - 5   -10 + 10 )  =  ( 1  0 )
        ( 1  3 )( -1   2 )     ( 3 - 3    -5 +  6 )     ( 0  1 )

        (  3  -5 )( 2  5 )  =  (  6 - 5   15 - 15 )  =  ( 1  0 )
        ( -1   2 )( 1  3 )     ( -2 + 2   -5 +  6 )     ( 0  1 )

        Thus ( 2  5 ) and (  3  -5 ) are invertible and are inverses of each other.
             ( 1  3 )     ( -1   2 )
We show (Problem 3.37) that for square matrices, AB = I if and only if BA = I; hence
it is necessary to test only one product to determine whether two given matrices are inverses,
as in the next example.
Example 3.15:
        ( 1   0  2 )( -11   2   2 )     ( -11 + 0 + 12    2 + 0 - 2    2 + 0 - 2 )     ( 1  0  0 )
        ( 2  -1  3 )(  -4   0   1 )  =  ( -22 + 4 + 18    4 + 0 - 3    4 - 1 - 3 )  =  ( 0  1  0 )
        ( 4   1  8 )(   6  -1  -1 )     ( -44 - 4 + 48    8 + 0 - 8    8 + 1 - 8 )     ( 0  0  1 )
Thus the two matrices are invertible and are inverses of each other.
We now calculate the inverse of a general 2 x 2 matrix A = [ a  b ]. We seek scalars
                                                           [ c  d ]
x, y, z, w such that

    [ a  b ] [ x  y ]  =  [ 1  0 ]    or    [ ax+bz   ay+bw ]  =  [ 1  0 ]
    [ c  d ] [ z  w ]     [ 0  1 ]          [ cx+dz   cy+dw ]     [ 0  1 ]
which reduces to solving the following two systems of linear equations in two unknowns:

    ax + bz = 1        ay + bw = 0
    cx + dz = 0        cy + dw = 1

If we let |A| = ad - bc, then by Problem 2.27, page 33, the above systems have solutions if
and only if |A| ≠ 0; such solutions are unique and are as follows:

    x = d/(ad - bc) = d/|A|,   y = -b/(ad - bc) = -b/|A|,
    z = -c/(ad - bc) = -c/|A|, w = a/(ad - bc) = a/|A|

Accordingly,

    A^-1 = [  d/|A|   -b/|A| ]  =  (1/|A|) [  d  -b ]
           [ -c/|A|    a/|A| ]             [ -c   a ]
Remark: The reader no doubt recognizes |A| = ad - bc as the determinant of the matrix
A; thus we see that a 2 x 2 matrix has an inverse if and only if its determinant
is not zero. This relationship, which holds true in general, will be further
investigated in Chapter 9 on determinants.
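The 2 x 2 inverse formula just derived is easy to translate into code; this Python sketch is our illustration (the helper name inv2 is invented), returning None when the determinant ad - bc vanishes:

```python
def inv2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via (1/|A|)[[d, -b], [-c, a]]; None if |A| = 0."""
    det = a * d - b * c
    if det == 0:
        return None  # no inverse when the determinant is zero
    return [[d / det, -b / det], [-c / det, a / det]]

print(inv2(2, 5, 1, 3))  # prints [[3.0, -5.0], [-1.0, 2.0]], as in Example 3.14
```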
BLOCK MATRICES
Using a system of horizontal and vertical lines, we can partition a matrix A into smaller
matrices called blocks (or cells) of A. The matrix A is then called a block matrix. Clearly,
a given matrix may be divided into blocks in different ways; for example,
    [ 1  -2 | 0   1 | 3 ]       [ 1  -2 | 0   1   3 ]       [ 1  -2   0 | 1   3 ]
    [ 2   3 | 5   7 | -2]   =   [ 2   3 | 5   7  -2 ]   =   [ 2   3   5 | 7  -2 ]
    [-------------------]       [--------------------]      [ 3   1   4 | 5   9 ]
    [ 3   1 | 4   5 | 9 ]       [ 3   1 | 4   5   9 ]
The convenience of the partition into blocks is that the result of operations on block matrices
can be obtained by carrying out the computation with the blocks, just as if they were the
actual elements of the matrices. This is illustrated below.
Suppose A is partitioned into blocks; say

    A = [ A11  A12  ...  A1n ]
        [ A21  A22  ...  A2n ]
        [ .................. ]
        [ Am1  Am2  ...  Amn ]

Multiplying each block by a scalar k multiplies each element of A by k; thus

    kA = [ kA11  kA12  ...  kA1n ]
         [ kA21  kA22  ...  kA2n ]
         [ ...................   ]
         [ kAm1  kAm2  ...  kAmn ]
Now suppose a matrix B is partitioned into the same number of blocks as A; say

    B = [ B11  B12  ...  B1n ]
        [ B21  B22  ...  B2n ]
        [ .................. ]
        [ Bm1  Bm2  ...  Bmn ]
Furthermore, suppose the corresponding blocks of A and B have the same size. Adding
these corresponding blocks adds the corresponding elements of A and B. Accordingly,

    A + B = [ A11+B11  A12+B12  ...  A1n+B1n ]
            [ A21+B21  A22+B22  ...  A2n+B2n ]
            [ .............................. ]
            [ Am1+Bm1  Am2+Bm2  ...  Amn+Bmn ]
The case of matrix multiplication is less obvious, but still true. That is, suppose matrices
U and V are partitioned into blocks as follows:

    U = [ U11  U12  ...  U1p ]          V = [ V11  V12  ...  V1n ]
        [ U21  U22  ...  U2p ]   and        [ V21  V22  ...  V2n ]
        [ .................. ]              [ .................. ]
        [ Um1  Um2  ...  Ump ]              [ Vp1  Vp2  ...  Vpn ]

such that the number of columns of each block Uik is equal to the number of rows of each
block Vkj. Then

    UV = [ W11  W12  ...  W1n ]
         [ W21  W22  ...  W2n ]
         [ .................. ]
         [ Wm1  Wm2  ...  Wmn ]

where  Wij = Ui1 V1j + Ui2 V2j + ... + Uip Vpj
The proof of the above formula for UV is straightforward, but detailed and lengthy. It
is left as a supplementary problem (Problem 3.68).
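The block formula Wij = Ui1 V1j + Ui2 V2j + ... + Uip Vpj can be spot-checked numerically. The following Python sketch is our own illustration (all names are invented) and computes one block both ways:

```python
def mat_mul(A, B):
    """Multiply an m x p matrix by a p x n matrix (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# U is 2 x 4 and V is 4 x 2; each is split into two 2 x 2 blocks.
U11, U12 = [[1, 2], [0, 1]], [[3, 0], [1, 2]]
V11, V21 = [[1, 0], [2, 1]], [[0, 1], [1, 0]]
W11 = mat_add(mat_mul(U11, V11), mat_mul(U12, V21))  # block formula

U = [[1, 2, 3, 0], [0, 1, 1, 2]]  # the same matrices, unpartitioned
V = [[1, 0], [2, 1], [0, 1], [1, 0]]
print(W11, mat_mul(U, V))  # both print [[5, 5], [4, 2]]
```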
Solved Problems
MATRIX ADDITION AND SCALAR MULTIPLICATION
3.1. Compute:

    (i)  [ 1   2  -3   4 ]  +  [ 3  -5   6  -1 ]        (iii)  -3 [ 1   2  -3 ]
         [ 0  -5   1  -1 ]     [ 2   0  -2  -3 ]                  [ 4  -5   6 ]

    (ii) the sum of two matrices of different shapes.

(i) Add corresponding entries:

    [ 1+3   2-5  -3+6   4-1 ]     [ 4  -3   3   3 ]
    [ 0+2  -5+0   1-2  -1-3 ]  =  [ 2  -5  -1  -4 ]

(ii) The sum is not defined since the matrices have different shapes.

(iii) Multiply each entry in the matrix by the scalar -3:

    -3 [ 1   2  -3 ]  =  [  -3   -6    9 ]
       [ 4  -5   6 ]     [ -12   15  -18 ]
3.2. Let A = [ 2  -5   1 ],  B = [ 1  -2  -3 ],  C = [ 0   1  -2 ].  Find 3A + 4B - 2C.
             [ 3   0  -4 ]       [ 0  -1   5 ]       [ 1  -1  -1 ]

First perform the scalar multiplications, and then the matrix addition:

    3A + 4B - 2C = [ 6  -15    3 ] + [ 4  -8  -12 ] + [  0  -2   4 ] = [ 10  -25  -5 ]
                   [ 9    0  -12 ]   [ 0  -4   20 ]   [ -2   2   2 ]   [  7   -2  10 ]
3.3. Find x, y, z and w if   3 [ x  y ]  =  [  x   6 ]  +  [  4    x+y ]
                               [ z  w ]     [ -1  2w ]     [ z+w    3  ]

First write each side as a single matrix:

    [ 3x  3y ]  =  [  x+4     x+y+6 ]
    [ 3z  3w ]     [ z+w-1    2w+3  ]

Set corresponding entries equal to each other to obtain the system of four equations,

    3x = x + 4            2x = 4
    3y = x + y + 6   or   2y = 6 + x
    3z = z + w - 1        2z = w - 1
    3w = 2w + 3            w = 3

The solution is: x = 2, y = 4, z = 1, w = 3.
3.4. Prove Theorem 3.1(v): Let A and B be m x n matrices and k a scalar. Then
k(A + B) = kA + kB.

Suppose A = (a_ij) and B = (b_ij). Then a_ij + b_ij is the ij-entry of A + B, and so k(a_ij + b_ij)
is the ij-entry of k(A + B). On the other hand, ka_ij and kb_ij are the ij-entries of kA and kB respectively,
and so ka_ij + kb_ij is the ij-entry of kA + kB. But k, a_ij and b_ij are scalars in a field; hence

    k(a_ij + b_ij) = ka_ij + kb_ij,   for every i, j

Thus k(A + B) = kA + kB, as corresponding entries are equal.

Remark: Observe the similarity of this proof and the proof of Theorem 1.1(v) in Problem 1.6, page
7. In fact, all other sections in the above theorem are proven in the same way as the
corresponding sections of Theorem 1.1.
MATRIX MULTIPLICATION
3.5. Let (r x s) denote a matrix with shape rxs. Find the shape of the following products
if the product is defined:
(i) (2x3)(3x4) (iii) (1 x 2)(3 x 1) (v) (3 x 4)(3 x 4)
(ii) (4xl)(lx2) (iv) (5 x 2)(2 x 3) (vi) (2 x 2)(2 x 4)
Recall that an m X p matrix and a qXn matrix are multipliable only when p = q, and then
the product is an m X n matrix. Thus each of the above products is defined if the "inner" numbers
are equal, and then the product will have the shape of the "outer" numbers in the given order.
(i) The product is a 2 x 4 matrix.
(ii) The product is a 4 x 2 matrix.
(iii) The product is not defined since the inner numbers 2 and 3 are not equal.
(iv) The product is a 5 x 3 matrix.
(v) The product is not defined even though the matrices have the same shape.
(vi) The product is a 2 x 4 matrix.
3.6. Let A = [ 1   3 ]  and  B = [ 2   0  -4 ].  Find (i) AB, (ii) BA.
             [ 2  -1 ]           [ 3  -2   6 ]

(i) Since A is 2 x 2 and B is 2 x 3, the product AB is defined and is a 2 x 3 matrix. To obtain the
entries in the first row of AB, multiply the first row (1, 3) of A by the columns (2, 3), (0, -2)
and (-4, 6) of B, respectively:

    [ 1   3 ] [ 2   0  -4 ]  =  [ 1·2+3·3   1·0+3·(-2)   1·(-4)+3·6 ]  =  [ 11  -6  14 ]
    [ 2  -1 ] [ 3  -2   6 ]

To obtain the entries in the second row of AB, multiply the second row (2, -1) of A by the
columns of B, respectively:

    [ 1   3 ] [ 2   0  -4 ]  =  [      11            -6              14       ]
    [ 2  -1 ] [ 3  -2   6 ]     [ 2·2+(-1)·3   2·0+(-1)·(-2)   2·(-4)+(-1)·6  ]

Thus  AB = [ 11  -6   14 ]
           [  1   2  -14 ]

(ii) Note that B is 2 x 3 and A is 2 x 2. Since the inner numbers 3 and 2 are not equal, the product
BA is not defined.
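The row-by-column rule of Problem 3.6, together with the shape test on the "inner" numbers, can be sketched in Python (an illustration of ours; mat_mul is an invented helper):

```python
def mat_mul(A, B):
    """Multiply an m x p matrix by a p x n matrix given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions differ; product undefined")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 3], [2, -1]]
B = [[2, 0, -4], [3, -2, 6]]
print(mat_mul(A, B))  # prints [[11, -6, 14], [1, 2, -14]]
# mat_mul(B, A) raises ValueError, since B is 2 x 3 and A is 2 x 2
```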
3.7. Given A = (2, 1) and B = [ 1  -2   0 ],  find (i) AB, (ii) BA.
                              [ 4   5  -3 ]

(i) Since A is 1 x 2 and B is 2 x 3, the product AB is defined and is a 1 x 3 matrix, i.e. a row
vector with 3 components. To obtain the components of AB, multiply the row of A by each
column of B:

    AB = (2, 1) [ 1  -2   0 ]  =  (2·1 + 1·4,  2·(-2) + 1·5,  2·0 + 1·(-3))  =  (6, 1, -3)
                [ 4   5  -3 ]

(ii) Note that B is 2 x 3 and A is 1 x 2. Since the inner numbers 3 and 1 are not equal, the product
BA is not defined.
3.8. Given A = [  2  -1 ]  and  B = [ 1  -2  -5 ],  find (i) AB, (ii) BA.
               [  1   0 ]           [ 3   4   0 ]
               [ -3   4 ]

(i) Since A is 3 x 2 and B is 2 x 3, the product AB is defined and is a 3 x 3 matrix. To obtain the
first row of AB, multiply the first row of A by each column of B, respectively:

    [  2  -1 ] [ 1  -2  -5 ]  =  [ 2-3   -4-4   -10+0 ]  =  [ -1  -8  -10 ]
    [  1   0 ] [ 3   4   0 ]
    [ -3   4 ]

To obtain the second row of AB, multiply the second row of A by each column of B,
respectively:

    [  2  -1 ] [ 1  -2  -5 ]  =  [  -1     -8     -10  ]  =  [ -1  -8  -10 ]
    [  1   0 ] [ 3   4   0 ]     [ 1+0   -2+0    -5+0  ]     [  1  -2   -5 ]
    [ -3   4 ]

To obtain the third row of AB, multiply the third row of A by each column of B, respectively:

    [  2  -1 ] [ 1  -2  -5 ]  =  [   -1       -8      -10  ]  =  [ -1  -8  -10 ]
    [  1   0 ] [ 3   4   0 ]     [    1       -2       -5  ]     [  1  -2   -5 ]
    [ -3   4 ]                   [ -3+12    6+16    15+0   ]     [  9  22   15 ]

Thus  AB = [ -1  -8  -10 ]
           [  1  -2   -5 ]
           [  9  22   15 ]

(ii) Since B is 2 x 3 and A is 3 x 2, the product BA is defined and is a 2 x 2 matrix. To obtain the
first row of BA, multiply the first row of B by each column of A, respectively:

    [ 1  -2  -5 ] [  2  -1 ]  =  [ 2-2+15   -1+0-20 ]  =  [ 15  -21 ]
    [ 3   4   0 ] [  1   0 ]
                  [ -3   4 ]

To obtain the second row of BA, multiply the second row of B by each column of A, respectively:

    [ 1  -2  -5 ] [  2  -1 ]  =  [   15        -21   ]  =  [ 15  -21 ]
    [ 3   4   0 ] [  1   0 ]     [ 6+4+0    -3+0+0   ]     [ 10   -3 ]
                  [ -3   4 ]

Thus  BA = [ 15  -21 ]
           [ 10   -3 ]

Remark: Observe that in this case both AB and BA are defined, but they are not equal; in fact they
do not even have the same shape.
3.9. Let A = [ 2  -1   0 ]  and let B be the given 3 x 4 matrix, whose first, third and fourth
             [ 1   0  -3 ]
columns are, respectively,

    [ 1 ]      [  0 ]      [  1 ]
    [ 2 ],     [  3 ],     [ -1 ]
    [ 4 ]      [ -2 ]      [  0 ]

(i) Determine the shape of AB. (ii) Let c_ij denote the element in the ith row and
jth column of the product matrix AB, that is, AB = (c_ij). Find: c_23, c_14 and c_21.

(i) Since A is 2 x 3 and B is 3 x 4, the product AB is a 2 x 4 matrix.

(ii) Now c_ij is defined as the product of the ith row of A by the jth column of B. Hence:

    c_23 = (1, 0, -3) · (0, 3, -2) = 1·0 + 0·3 + (-3)·(-2) = 0 + 0 + 6 = 6
    c_14 = (2, -1, 0) · (1, -1, 0) = 2·1 + (-1)·(-1) + 0·0 = 2 + 1 + 0 = 3
    c_21 = (1, 0, -3) · (1, 2, 4) = 1·1 + 0·2 + (-3)·4 = 1 + 0 - 12 = -11
3.10. Compute:

    (i)  [ 1  6 ] [ 4   0 ]      (iii) [  2 ] [ 1  6 ]       (v) (2, -1) [  1 ]
         [-3  5 ] [ 2  -1 ]            [ -7 ] [-3  5 ]                   [ -6 ]

    (ii) [ 1  6 ] [  2 ]         (iv)  [ 1 ] (3, 2)
         [-3  5 ] [ -7 ]               [ 6 ]

(i) The first factor is 2 x 2 and the second is 2 x 2, so the product is defined and is a 2 x 2 matrix:

    [ 1  6 ] [ 4   0 ]  =  [ 1·4+6·2       1·0+6·(-1)    ]  =  [ 16  -6 ]
    [-3  5 ] [ 2  -1 ]     [ (-3)·4+5·2    (-3)·0+5·(-1) ]     [ -2  -5 ]

(ii) The first factor is 2 x 2 and the second is 2 x 1, so the product is defined and is a 2 x 1 matrix:

    [ 1  6 ] [  2 ]  =  [ 1·2+6·(-7)     ]  =  [ -40 ]
    [-3  5 ] [ -7 ]     [ (-3)·2+5·(-7)  ]     [ -41 ]

(iii) Now the first factor is 2 x 1 and the second is 2 x 2. Since the inner numbers 1 and 2 are
distinct, the product is not defined.

(iv) Here the first factor is 2 x 1 and the second is 1 x 2, so the product is defined and is a 2 x 2
matrix:

    [ 1 ] (3, 2)  =  [ 1·3   1·2 ]  =  [  3   2 ]
    [ 6 ]            [ 6·3   6·2 ]     [ 18  12 ]

(v) The first factor is 1 x 2 and the second is 2 x 1, so the product is defined and is a 1 x 1 matrix,
which we frequently write as a scalar:

    (2, -1) [  1 ]  =  (2·1 + (-1)·(-6))  =  (8)  =  8
            [ -6 ]
3.11. Prove Theorem 3.2(i): (AB)C = A(BC).

Let A = (a_ij), B = (b_jk) and C = (c_kl). Furthermore, let AB = S = (s_ik) and BC = T = (t_jl).
Then

    s_ik = a_i1 b_1k + a_i2 b_2k + ... + a_im b_mk = Σ_{j=1}^{m} a_ij b_jk

    t_jl = b_j1 c_1l + b_j2 c_2l + ... + b_jn c_nl = Σ_{k=1}^{n} b_jk c_kl

Now multiplying S by C, i.e. (AB) by C, the element in the ith row and lth column of the matrix
(AB)C is

    s_i1 c_1l + s_i2 c_2l + ... + s_in c_nl = Σ_{k=1}^{n} s_ik c_kl = Σ_{k=1}^{n} Σ_{j=1}^{m} (a_ij b_jk) c_kl

On the other hand, multiplying A by T, i.e. A by BC, the element in the ith row and lth column
of the matrix A(BC) is

    a_i1 t_1l + a_i2 t_2l + ... + a_im t_ml = Σ_{j=1}^{m} a_ij t_jl = Σ_{j=1}^{m} Σ_{k=1}^{n} a_ij (b_jk c_kl)

Since the above sums are equal, the theorem is proven.
3.12. Prove Theorem 3.2(ii): A(B + C) = AB + AC.

Let A = (a_ij), B = (b_jk) and C = (c_jk). Furthermore, let D = B + C = (d_jk), E = AB = (e_ik)
and F = AC = (f_ik). Then

    d_jk = b_jk + c_jk

    e_ik = a_i1 b_1k + a_i2 b_2k + ... + a_im b_mk = Σ_{j=1}^{m} a_ij b_jk

    f_ik = a_i1 c_1k + a_i2 c_2k + ... + a_im c_mk = Σ_{j=1}^{m} a_ij c_jk

Hence the element in the ith row and kth column of the matrix AB + AC is

    e_ik + f_ik = Σ_{j=1}^{m} a_ij b_jk + Σ_{j=1}^{m} a_ij c_jk = Σ_{j=1}^{m} a_ij (b_jk + c_jk)

On the other hand, the element in the ith row and kth column of the matrix AD = A(B + C) is

    a_i1 d_1k + a_i2 d_2k + ... + a_im d_mk = Σ_{j=1}^{m} a_ij d_jk = Σ_{j=1}^{m} a_ij (b_jk + c_jk)

Thus A(B + C) = AB + AC since the corresponding elements are equal.
TRANSPOSE
3.13. Find the transpose A^t of a given matrix A.
Rewrite the rows of A as the columns of A^t: the ith column of A^t is the ith row of A.
3.14. Let A be an arbitrary matrix. Under what conditions is the product AA* defined?
Suppose A is an m x n matrix; then A^t is n x m. Thus the product AA^t is always defined.
Observe that A^t A is also defined. Here AA^t is an m x m matrix, whereas A^t A is an n x n matrix.
3.15. Let A = [ 1   2  0 ].  Find (i) AA^t, (ii) A^t A.
              [ 3  -1  4 ]

To obtain A^t, rewrite the rows of A as columns:  A^t = [ 1   3 ]
                                                        [ 2  -1 ]
                                                        [ 0   4 ]
Then

    AA^t = [ 1   2  0 ] [ 1   3 ]  =  [ 1·1+2·2+0·0      1·3+2·(-1)+0·4    ]  =  [ 5   1 ]
           [ 3  -1  4 ] [ 2  -1 ]     [ 3·1+(-1)·2+4·0   3·3+(-1)·(-1)+4·4 ]     [ 1  26 ]
                        [ 0   4 ]

    A^t A = [ 1   3 ] [ 1   2  0 ]  =  [ 1·1+3·3      1·2+3·(-1)      1·0+3·4    ]     [ 10  -1  12 ]
            [ 2  -1 ] [ 3  -1  4 ]     [ 2·1+(-1)·3   2·2+(-1)·(-1)   2·0+(-1)·4 ]  =  [ -1   5  -4 ]
            [ 0   4 ]                  [ 0·1+4·3      0·2+4·(-1)      0·0+4·4    ]     [ 12  -4  16 ]
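Both products of Problem 3.15 can be checked with a short Python sketch (ours, not the book's; transpose and mat_mul are invented helpers):

```python
def transpose(A):
    """Rewrite the rows of A as columns."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    """Multiply an m x p matrix by a p x n matrix (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 0], [3, -1, 4]]          # the matrix of Problem 3.15
print(mat_mul(A, transpose(A)))      # prints [[5, 1], [1, 26]]
print(mat_mul(transpose(A), A))      # the 3 x 3 product A^t A
```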
3.16. Prove Theorem 3.3(iv): (AB)^t = B^t A^t.

Let A = (a_ij) and B = (b_jk). Then the element in the ith row and jth column of the matrix
AB is

    a_i1 b_1j + a_i2 b_2j + ... + a_im b_mj    (1)

Thus (1) is the element which appears in the jth row and ith column of the transpose matrix (AB)^t.

On the other hand, the jth row of B^t consists of the elements from the jth column of B:

    (b_1j  b_2j  ...  b_mj)    (2)

Furthermore, the ith column of A^t consists of the elements from the ith row of A:

    (a_i1)
    (a_i2)
    ( ... )
    (a_im)    (3)

Consequently, the element appearing in the jth row and ith column of the matrix B^t A^t is the
product of (2) by (3), which gives (1). Thus (AB)^t = B^t A^t.
ECHELON MATRICES AND ELEMENTARY ROW OPERATIONS
3.17. Circle the distinguished elements in each of the following echelon matrices. Which
are row reduced echelon matrices?

    [ 1  2  -3   0   1 ]      [ 0  1   7  -5   0 ]      [ 0  1   5   0   2 ]
    [ 0  0   5   2  -4 ]      [ 0  0   0   0   1 ]      [ 0  0   0   1   4 ]
    [ 0  0   0   7   3 ]      [ 0  0   0   0   0 ]      [ 0  0   0   0   0 ]

The distinguished elements are the first nonzero entries in the rows: 1, 5 and 7 in the first
matrix, and the 1's in the second and third matrices.
An echelon matrix is row reduced if its distinguished elements are each 1 and are the only nonzero
entries in their respective columns. Thus the second and third matrices are row reduced, but the
first is not.
3.18. Given A = [ 1  -2   3  -1 ]
                [ 2  -1   2   2 ] .
                [ 3   1   2   3 ]
(i) Reduce A to echelon form. (ii) Reduce A to row canonical form, i.e. to row reduced echelon form.

(i) Apply the operations R2 → -2R1 + R2 and R3 → -3R1 + R3, and then the operation
R3 → -7R2 + 3R3 to reduce A to echelon form:

    A  to  [ 1  -2   3  -1 ]  to  [ 1  -2   3   -1 ]
           [ 0   3  -4   4 ]      [ 0   3  -4    4 ]
           [ 0   7  -7   6 ]      [ 0   0   7  -10 ]

(ii) Method 1. Apply the operation R1 → 2R2 + 3R1, and then the operations R1 → -R3 + 7R1
and R2 → 4R3 + 7R2 to the last matrix in (i) to further reduce A:

    to  [ 3   0   1    5 ]  to  [ 21   0   0   45 ]
        [ 0   3  -4    4 ]      [  0  21   0  -12 ]
        [ 0   0   7  -10 ]      [  0   0   7  -10 ]

Finally, multiply R1 by 1/21, R2 by 1/21 and R3 by 1/7 to obtain the row canonical form of A:

    [ 1   0   0   15/7 ]
    [ 0   1   0   -4/7 ]
    [ 0   0   1  -10/7 ]

Method 2. In the last matrix in (i), multiply R2 by 1/3 and R3 by 1/7 to obtain an echelon
matrix where the distinguished elements are each 1:

    [ 1  -2    3     -1  ]
    [ 0   1  -4/3   4/3  ]
    [ 0   0    1   -10/7 ]

Now apply the operation R1 → 2R2 + R1, and then the operations R2 → (4/3)R3 + R2 and
R1 → (-1/3)R3 + R1 to obtain the above row canonical form of A.

Remark: Observe that one advantage of the first method is that fractions did not appear
until the very last step.
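The whole reduction can be automated. The Python sketch below is ours, not the book's (row_canonical is an invented name); it follows the Gauss-Jordan pattern of Method 2 and uses exact fractions so that, as the Remark suggests, fractional arithmetic causes no rounding trouble:

```python
from fractions import Fraction

def row_canonical(A):
    """Return the row reduced echelon form of A (given as a list of rows)."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                              # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]           # interchange rows
        M[r] = [x / M[r][c] for x in M[r]]        # make the pivot 1
        for i in range(rows):
            if i != r and M[i][c] != 0:           # clear the rest of the column
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

A = [[1, -2, 3, -1], [2, -1, 2, 2], [3, 1, 2, 3]]   # the matrix of Problem 3.18
for row in row_canonical(A):
    print([str(x) for x in row])   # last column: 15/7, -4/7, -10/7
```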
3.19. Determine the row canonical form of A = [ 0  1   3  -2 ]
                                              [ 2  1  -4   3 ] .
                                              [ 2  3   2  -1 ]

    A  to  [ 2  1  -4   3 ]  to  [ 2  1  -4   3 ]  to  [ 1  0  -7/2  5/2 ]
           [ 0  1   3  -2 ]      [ 0  1   3  -2 ]      [ 0  1    3   -2  ]
           [ 2  3   2  -1 ]      [ 0  0   0   0 ]      [ 0  0    0    0  ]

Note that the second matrix is already in echelon form.
3.20. Reduce A = [  6  3  -4 ]  to echelon form, and then to row reduced echelon form,
                 [ -4  1  -6 ]
                 [  1  2  -5 ]
i.e. to its row canonical form.

The computations are usually simpler if the "pivotal" element is 1. Hence first interchange the
first and third rows:

    A  to  [  1  2  -5 ]  to  [ 1  2   -5 ]  to  [ 1  0    7/9  ]
           [ -4  1  -6 ]      [ 0  9  -26 ]      [ 0  1  -26/9  ]
           [  6  3  -4 ]      [ 0  0    0 ]      [ 0  0     0   ]

Note that the second matrix is already in echelon form.
3.21. Show that each of the following elementary row operations has an inverse operation
of the same type.

    [E1]: Interchange the ith row and the jth row: Ri ↔ Rj.
    [E2]: Multiply the ith row by a nonzero scalar k: Ri → kRi, k ≠ 0.
    [E3]: Replace the ith row by k times the jth row plus the ith row: Ri → kRj + Ri.

(i) Interchanging the same two rows twice, we obtain the original matrix; that is, this operation
is its own inverse.

(ii) Multiplying the ith row by k and then by k^-1, or by k^-1 and then by k, we obtain the original
matrix. In other words, the operations Ri → kRi and Ri → k^-1 Ri are inverses.

(iii) Applying the operation Ri → kRj + Ri and then the operation Ri → -kRj + Ri, or applying
the operation Ri → -kRj + Ri and then the operation Ri → kRj + Ri, we obtain the original
matrix. In other words, the operations Ri → kRj + Ri and Ri → -kRj + Ri are inverses.
SQUARE MATRICES
3.22. Let A = [ 1   2 ].  Find (i) A^2, (ii) A^3, (iii) f(A), where f(x) = 2x^3 - 4x + 5.
              [ 4  -3 ]

(i) A^2 = AA = [ 1   2 ] [ 1   2 ]  =  [ 1·1+2·4      1·2+2·(-3)    ]  =  [  9  -4 ]
               [ 4  -3 ] [ 4  -3 ]     [ 4·1+(-3)·4   4·2+(-3)·(-3) ]     [ -8  17 ]

(ii) A^3 = A^2 A = [  9  -4 ] [ 1   2 ]  =  [ 9·1+(-4)·4       9·2+(-4)·(-3)   ]  =  [ -7   30 ]
                   [ -8  17 ] [ 4  -3 ]     [ (-8)·1+17·4   (-8)·2+17·(-3)     ]     [ 60  -67 ]

(iii) To find f(A), first substitute A for x and 5I for the constant 5 in the given polynomial
f(x) = 2x^3 - 4x + 5:

    f(A) = 2A^3 - 4A + 5I = 2 [ -7   30 ] - 4 [ 1   2 ] + 5 [ 1  0 ]
                              [ 60  -67 ]     [ 4  -3 ]     [ 0  1 ]

Then multiply each matrix by its respective scalar:

    f(A) = [ -14    60 ] + [  -4   -8 ] + [ 5  0 ]
           [ 120  -134 ]   [ -16   12 ]   [ 0  5 ]

Lastly, add the corresponding elements in the matrices:

    f(A) = [ -14-4+5       60-8+0   ]  =  [ -13    52 ]
           [ 120-16+0   -134+12+5   ]     [ 104  -117 ]
3.23. Referring to Problem 3.22, show that A is a zero of the polynomial g(x) = x^2 + 2x - 11.

A is a zero of g(x) if the matrix g(A) is the zero matrix. Compute g(A) as was done for f(A),
i.e. first substitute A for x and 11I for the constant 11 in g(x) = x^2 + 2x - 11:

    g(A) = A^2 + 2A - 11I = [  9  -4 ] + 2 [ 1   2 ] - 11 [ 1  0 ]
                            [ -8  17 ]     [ 4  -3 ]      [ 0  1 ]

Then multiply each matrix by the scalar preceding it:

    g(A) = [  9  -4 ] + [ 2   4 ] + [ -11    0 ]
           [ -8  17 ]   [ 8  -6 ]   [   0  -11 ]

Lastly, add the corresponding elements in the matrices:

    g(A) = [ 9+2-11     -4+4+0  ]  =  [ 0  0 ]
           [ -8+8+0    17-6-11  ]     [ 0  0 ]

Since g(A) = 0, A is a zero of the polynomial g(x).
3.24. Given A = [ 1   3 ].  Find a nonzero column vector u = [ x ] such that Au = 3u.
                [ 4  -3 ]                                    [ y ]

First set up the matrix equation Au = 3u:

    [ 1   3 ] [ x ]  =  3 [ x ]
    [ 4  -3 ] [ y ]       [ y ]

Write each side as a single matrix (column vector):

    [  x + 3y ]  =  [ 3x ]
    [ 4x - 3y ]     [ 3y ]

Set corresponding elements equal to each other to obtain the system of equations (and reduce to
echelon form):

    x + 3y = 3x         2x - 3y = 0         2x - 3y = 0
                  or                 or                    or   2x - 3y = 0
    4x - 3y = 3y        4x - 6y = 0              0 = 0

The system reduces to one homogeneous equation in two unknowns, and so has an infinite number
of solutions. To obtain a nonzero solution let, say, y = 2; then x = 3. That is, x = 3, y = 2 is a
solution of the system. Thus the vector u = [ 3 ] is nonzero and has the property that Au = 3u.
                                            [ 2 ]
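The check Au = 3u is a one-line computation; this small Python sketch (ours; mat_vec is an invented helper) verifies the solution found above:

```python
def mat_vec(A, u):
    """Apply matrix A (list of rows) to the column vector u (list)."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A = [[1, 3], [4, -3]]
u = [3, 2]
print(mat_vec(A, u))                          # prints [9, 6]
print(mat_vec(A, u) == [3 * x for x in u])    # prints True, i.e. Au = 3u
```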
3.25. Find the inverse of [ 3  5 ].
                          [ 2  3 ]

Method 1. We seek scalars x, y, z and w for which

    [ 3  5 ] [ x  y ]  =  [ 1  0 ]    or    [ 3x+5z   3y+5w ]  =  [ 1  0 ]
    [ 2  3 ] [ z  w ]     [ 0  1 ]          [ 2x+3z   2y+3w ]     [ 0  1 ]

or which satisfy

    3x + 5z = 1         3y + 5w = 0
                  and
    2x + 3z = 0         2y + 3w = 1

The solution of the first system is x = -3, z = 2, and of the second system is y = 5, w = -3.
Thus the inverse of the given matrix is

    [ -3   5 ]
    [  2  -3 ]

Method 2. We use the general formula for the inverse A^-1 of the 2 x 2 matrix A:

    A^-1 = (1/|A|) [  d  -b ]    where |A| = ad - bc
                   [ -c   a ]

Thus if A = [ 3  5 ], then |A| = 9 - 10 = -1 and A^-1 = -1 [  3  -5 ]  =  [ -3   5 ].
            [ 2  3 ]                                       [ -2   3 ]     [  2  -3 ]
MISCELLANEOUS PROBLEMS
3.26. Compute AB using block multiplication, where

    A = [ 1  2 | 1 ]          B = [ 1  2  3 | 1 ]
        [ 3  4 | 1 ]   and        [ 4  5  6 | 1 ]
        [ 0  0 | 2 ]              [ 0  0  0 | 0 ]

Hence A = [ E  F ]  and  B = [ R  S ]  where E, F, G, R, S and T are the given blocks. Then
          [ 0  G ]           [ 0  T ]

    AB = [ ER   ES + FT ]  =  [  9  12  15  3 ]
         [ 0       GT   ]     [ 19  26  33  7 ]
                              [  0   0   0  0 ]
3.27. Suppose B = (R1, R2, ..., Rn), i.e. that Ri is the ith row of B. Suppose BA is
defined. Show that BA = (R1 A, R2 A, ..., Rn A), i.e. that Ri A is the ith row of BA.

Let A^1, A^2, ..., A^m denote the columns of A. By definition of matrix multiplication, the ith row
of BA is (Ri · A^1, Ri · A^2, ..., Ri · A^m). But by matrix multiplication, Ri A = (Ri · A^1, Ri · A^2, ...,
Ri · A^m). Thus the ith row of BA is Ri A.

3.28. Let e_i = (0, ..., 0, 1, 0, ..., 0) be the row vector with 1 in the ith position and 0 else-
where. Show that e_i A = Ri, the ith row of A.

Observe that e_i is the ith row of I, the identity matrix. By the preceding problem, the ith row
of IA is e_i A. But IA = A. Accordingly, e_i A = Ri, the ith row of A.
3.29. Show: (i) If A has a zero row, then AB has a zero row.
(ii) If B has a zero column, then AB has a zero column.
(iii) Any matrix with a zero row or a zero column is not invertible.
(i) Let Ri be the zero row of A, and B^1, ..., B^n the columns of B. Then the ith row of AB is

    (Ri · B^1,  Ri · B^2,  ...,  Ri · B^n)  =  (0, 0, ..., 0)

(ii) Let C_j be the zero column of B, and A_1, ..., A_m the rows of A. Then the jth column of AB is

    [ A_1 · C_j ]     [ 0 ]
    [ A_2 · C_j ]  =  [ 0 ]
    [    ...    ]     [...]
    [ A_m · C_j ]     [ 0 ]

(iii) A matrix A is invertible means that there exists a matrix A^-1 such that AA^-1 = A^-1 A = I.
But the identity matrix I has no zero row or zero column; hence by (i) and (ii) A cannot have
a zero row or a zero column. In other words, a matrix with a zero row or a zero column cannot
be invertible.
3.30. Let A and B be invertible matrices (of the same order). Show that the product
AB is also invertible and (AB)^-1 = B^-1 A^-1. Thus by induction, (A1 A2 ··· An)^-1 =
An^-1 ··· A2^-1 A1^-1 where the Ai are invertible.

    (AB)(B^-1 A^-1) = A(BB^-1)A^-1 = AIA^-1 = AA^-1 = I
and
    (B^-1 A^-1)(AB) = B^-1(A^-1 A)B = B^-1 IB = B^-1 B = I

Thus (AB)^-1 = B^-1 A^-1.
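A quick numerical check of (AB)^-1 = B^-1 A^-1 on 2 x 2 matrices (our illustration; inv2 and mat_mul are invented helpers):

```python
def mat_mul(A, B):
    """Multiply two 2 x 2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Inverse of a 2 x 2 matrix via the ad - bc formula (Problem 3.25)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A, B = [[2, 5], [1, 3]], [[1, 2], [1, 3]]
left = inv2(mat_mul(A, B))           # (AB)^-1
right = mat_mul(inv2(B), inv2(A))    # B^-1 A^-1
print(left == right)                 # prints True
```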
3.31. Let u and v be distinct vectors. Show that, for each scalar k ∈ K, the vectors
u + k(u - v) are distinct.

It suffices to show that if

    u + k1(u - v) = u + k2(u - v)    (1)

then k1 = k2. Suppose (1) holds. Then

    k1(u - v) = k2(u - v)    or    (k1 - k2)(u - v) = 0

Since u and v are distinct, u - v ≠ 0. Hence k1 - k2 = 0 and k1 = k2.
ELEMENTARY MATRICES AND APPLICATIONS*

3.32. A matrix obtained from the identity matrix by a single elementary row operation is
called an elementary matrix. Determine the 3-square elementary matrices corre-
sponding to the operations R1 ↔ R2, R3 → -7R3 and R2 → -3R1 + R2.

Apply the operations to the identity matrix I3 = [ 1  0  0 ] to obtain
                                                 [ 0  1  0 ]
                                                 [ 0  0  1 ]

    E1 = [ 0  1  0 ]      E2 = [ 1  0   0 ]      E3 = [  1  0  0 ]
         [ 1  0  0 ],          [ 0  1   0 ],          [ -3  1  0 ]
         [ 0  0  1 ]           [ 0  0  -7 ]           [  0  0  1 ]
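The three kinds of elementary matrices can be generated by applying each operation to the identity, as in Problem 3.32; the Python sketch below is our illustration (all function names invented):

```python
def identity(n):
    """The n x n identity matrix as a list of rows."""
    return [[int(i == j) for j in range(n)] for i in range(n)]

def swap(M, i, j):
    """Ri <-> Rj (rows indexed from 0)."""
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale(M, i, k):
    """Ri -> k Ri."""
    M = [row[:] for row in M]
    M[i] = [k * x for x in M[i]]
    return M

def add_multiple(M, i, j, k):
    """Ri -> k Rj + Ri."""
    M = [row[:] for row in M]
    M[i] = [a + k * b for a, b in zip(M[i], M[j])]
    return M

E1 = swap(identity(3), 0, 1)            # R1 <-> R2
E2 = scale(identity(3), 2, -7)          # R3 -> -7 R3
E3 = add_multiple(identity(3), 1, 0, -3)  # R2 -> -3 R1 + R2
print(E1, E2, E3, sep="\n")
```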
3.33. Prove: Let e be an elementary row operation and E the corresponding m-square elemen-
tary matrix, i.e. E = e(I_m). Then for any m x n matrix A, e(A) = EA. That is, the re-
sult e(A) of applying the operation e on the matrix A can be obtained by multiplying
A by the corresponding elementary matrix E.

(*This section is rather detailed and may be omitted in a first reading. It is not needed except for certain
results in Chapter 9 on determinants.)

Let R_i be the ith row of A; we denote this by writing A = (R_1, ..., R_m). By Problem 3.27, if
B is a matrix for which AB is defined, then AB = (R_1 B, ..., R_m B). We also let

    e_i = (0, ..., 0, 1, 0, ..., 0),   with the 1 as the ith component

By Problem 3.28, e_i A = R_i. We also remark that I = (e_1, ..., e_m) is the identity matrix.

(i) Let e be the elementary row operation R_i ↔ R_j. Then E = e(I) has e_j as its ith row and
e_i as its jth row, and e(A) has R_j as its ith row and R_i as its jth row. Thus

    EA = (e_1 A, ..., e_j A, ..., e_i A, ..., e_m A) = (R_1, ..., R_j, ..., R_i, ..., R_m) = e(A)

(ii) Now let e be the elementary row operation R_i → kR_i, k ≠ 0. Then E = e(I) has ke_i as its
ith row, and e(A) has kR_i as its ith row. Thus

    EA = (e_1 A, ..., (ke_i)A, ..., e_m A) = (R_1, ..., kR_i, ..., R_m) = e(A)

(iii) Lastly, let e be the elementary row operation R_i → kR_j + R_i. Then E = e(I) has ke_j + e_i
as its ith row, and e(A) has kR_j + R_i as its ith row. Using (ke_j + e_i)A = k(e_j A) + e_i A = kR_j + R_i,
we have

    EA = (e_1 A, ..., (ke_j + e_i)A, ..., e_m A) = (R_1, ..., kR_j + R_i, ..., R_m) = e(A)

Thus we have proven the theorem.
3.34. Show that A is row equivalent to B if and only if there exist elementary matrices
E1, ..., Es such that Es ··· E2 E1 A = B.

By definition, A is row equivalent to B if there exist elementary row operations e1, ..., es for
which es(···(e2(e1(A)))···) = B. But, by the preceding problem, the above holds if and only if
Es ··· E2 E1 A = B where Ei is the elementary matrix corresponding to ei.
3.35. Show that the elementary matrices are invertible and that their inverses are also
elementary matrices.

Let E be the elementary matrix corresponding to the elementary row operation e: e(I) = E.
Let e' be the inverse operation of e (see Problem 3.21) and E' its corresponding elementary matrix.
Then, by Problem 3.33,

    I = e'(e(I)) = e'(E) = E'E    and    I = e(e'(I)) = e(E') = EE'

Therefore E' is the inverse of E.
3.36. Prove that the following are equivalent:
    (i) A is invertible.
    (ii) A is row equivalent to the identity matrix I.
    (iii) A is a product of elementary matrices.

Suppose A is invertible and suppose A is row equivalent to the row reduced echelon matrix B.
Then there exist elementary matrices E1, E2, ..., Es such that Es ··· E2 E1 A = B. Since A is invert-
ible and each elementary matrix Ei is invertible, the product is invertible. But if B ≠ I, then B
has a zero row (Problem 3.47); hence B is not invertible (Problem 3.29). Thus B = I. In other
words, (i) implies (ii).

Now if (ii) holds, then there exist elementary matrices E1, E2, ..., Es such that

    Es ··· E2 E1 A = I,  and so  A = (Es ··· E2 E1)^-1 = E1^-1 E2^-1 ··· Es^-1

By the preceding problem, the Ei^-1 are also elementary matrices. Thus (ii) implies (iii).

Now if (iii) holds (A = E1 E2 ··· Es), then (i) must follow since the product of invertible
matrices is invertible.
3.37. Let A and B be square matrices of the same order. Show that if AB = I, then
B = A^-1. Thus AB = I if and only if BA = I.

Suppose A is not invertible. Then A is not row equivalent to the identity matrix I, and so A
is row equivalent to a matrix with a zero row. In other words, there exist elementary matrices
E1, ..., Es such that Es ··· E2 E1 A has a zero row. Hence Es ··· E2 E1 AB has a zero row. Accordingly,
AB is row equivalent to a matrix with a zero row and so is not row equivalent to I. But this con-
tradicts the fact that AB = I. Thus A is invertible. Consequently,

    B = IB = (A^-1 A)B = A^-1(AB) = A^-1 I = A^-1
3.38. Suppose A is invertible and, say, it is row reducible to the identity matrix I by the
sequence of elementary operations e1, ..., en. (i) Show that this sequence of elemen-
tary row operations applied to I yields A^-1. (ii) Use this result to obtain the inverse
of

    A = [ 1   0  2 ]
        [ 2  -1  3 ]
        [ 4   1  8 ]

(i) Let Ei be the elementary matrix corresponding to the operation ei. Then, by hypothesis and
Problem 3.34, En ··· E2 E1 A = I. Thus (En ··· E2 E1 I)A = I and hence A^-1 = En ··· E2 E1 I.
In other words, A^-1 can be obtained from I by applying the elementary row operations e1, ..., en.

(ii) Form the block matrix (A, I) and row reduce it to row canonical form:

    (A, I) = [ 1   0  2 | 1  0  0 ]   to   [ 1   0   2 |  1  0  0 ]   to   [ 1  0   2 |  1  0  0 ]
             [ 2  -1  3 | 0  1  0 ]        [ 0  -1  -1 | -2  1  0 ]        [ 0  1   1 |  2 -1  0 ]
             [ 4   1  8 | 0  0  1 ]        [ 0   1   0 | -4  0  1 ]        [ 0  0  -1 | -6  1  1 ]

        to   [ 1  0  0 | -11   2   2 ]
             [ 0  1  0 |  -4   0   1 ]
             [ 0  0  1 |   6  -1  -1 ]

Observe that the final block matrix is in the form (I, B). Hence A is invertible and B is its
inverse:

    A^-1 = [ -11   2   2 ]
           [  -4   0   1 ]
           [   6  -1  -1 ]

Remark: In case the final block matrix is not of the form (I, B), then the given matrix is not
row equivalent to I and so is not invertible.
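The procedure of part (ii) — adjoin I, row reduce, read off the inverse — can be sketched in Python (our illustration; the function name inverse is invented, and Fractions keep the arithmetic exact):

```python
from fractions import Fraction

def inverse(A):
    """Invert A by row reducing the block matrix (A, I); None if singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            return None                            # A is not invertible
        M[c], M[pivot] = M[pivot], M[c]            # bring a nonzero pivot up
        M[c] = [x / M[c][c] for x in M[c]]         # make the pivot 1
        for i in range(n):
            if i != c:                             # clear the column
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]                  # the right-hand block is A^-1

print(inverse([[3, 5], [2, 3]]))   # the inverse of Problem 3.25, entries as Fractions
```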
Supplementary Problems
MATRIX OPERATIONS
In Problems 3.393.41, let
(o1 !)• "il
4 03
2 3
3.39. Find: (i) A + B, (ii) A + C, (iii) 3A - 4B.
3.40. Find: (i) AB, (ii) AC, (iii) AD, (iv) BC, (v) BD, (vi) CD.
3.41. Find: (i) A^t, (ii) A^t C, (iii) D^t A^t, (iv) B^t A, (v) D^t D, (vi) DD^t.
3.42. Let A = [ a_1  a_2  a_3  a_4 ]
              [ b_1  b_2  b_3  b_4 ],  find (i) e_1 A, (ii) e_2 A, (iii) e_3 A.
              [ c_1  c_2  c_3  c_4 ]
3.43. Let e_i = (0, ..., 0, 1, 0, ..., 0) where 1 is the ith component. Show the following:
    (i) B e_i^t = C_i, the ith column of B. (By Problem 3.28, e_i A = R_i.)
    (ii) If e_i A = e_i B for each i, then A = B.
    (iii) If A e_i^t = B e_i^t for each i, then A = B.
ECHELON MATRICES AND ELEMENTARY ROW OPERATIONS
3.44. Reduce A to echelon form and then to its row canonical form, where

    (i) A = [ 1  2  -1   2  1 ]        (ii) A = [ 2   3  -2   5  1 ]
            [ 2  4   1  -2  3 ],                [ 3  -1   2   0  4 ]
            [ 3  6   2  -6  5 ]                 [ 4  -5   6  -5  7 ]
3.45. Reduce A to echelon form and then to its row canonical form, where

    (i) A = [ 1   3  -1  2 ]        (ii) A = [ 0  1   3  -2 ]
            [ 0  11  -5  3 ]                 [ 0  4  -1   3 ]
            [ 2  -5   3  1 ],                [ 0  0   2   1 ]
            [ 4   1   1  5 ]                 [ 0  5  -3   4 ]
3.46. Describe all the possible 2X2 matrices which are in row reduced echelon form.
3.47. Suppose A is a square row reduced echelon matrix. Show that if A ≠ I, the identity matrix, then
A has a zero row.
3.48. Show that every square echelon matrix is upper triangular, but not vice versa.
3.49. Show that row equivalence is an equivalence relation:
(i) A is row equivalent to A;
(ii) A row equivalent to B implies B row equivalent to A;
(iii) A row equivalent to B and B row equivalent to C implies A row equivalent to C.
SQUARE MATRICES
3.50. Let A = ( ). (i) Find A^2 and A^3. (ii) If f(x) = x^3 - 3x^2 - 2x + 4, find f(A). (iii) If
g(x) = x^2 - x - 8, find g(A).
3.51. Let B = ( ). (i) If f(x) = 2x^2 - 4x + 3, find f(B). (ii) If g(x) = x^2 - 4x - 12, find g(B).
(iii) Find a nonzero column vector u = [ x ] such that Bu = 6u.
                                       [ y ]
3.52. Matrices A and B are said to commute if AB = BA. Find all matrices [ x  y ] which commute
with [ 1  1 ].                                                           [ z  w ]
     [ 0  1 ]

3.53. Let A = [ 1  2 ].  Find A^n.
              [ 0  1 ]
3.54. Let A = ( „ „ and B = ( „ ,,
\0 3/ \0 11
Find: (i) A + B, (ii) AB, (iii) A^2 and A^3, (iv) A^n, (v) f(A) for a polynomial f(x).
3.56. Suppose the 2-square matrix B commutes with every 2-square matrix A, i.e. AB = BA. Show that

    B = [ k  0 ]
        [ 0  k ]

for some scalar k, i.e. B is a scalar matrix.
3.57. Let D_k be the m-square scalar matrix with diagonal elements k. Show that:
    (i) for any m x n matrix A, D_k A = kA; (ii) for any n x m matrix B, B D_k = kB.
3.58. Show that the sum, product and scalar multiple of:
(i) upper triangular matrices is upper triangular;
(ii) lower triangular matrices is lower triangular;
(iii) diagonal matrices is diagonal;
(iv) scalar matrices is scalar.
INVERTIBLE MATRICES
 ' '2 3\
3.59. Find the inverse of each matrix: (i) ( ^ c ) > (") ( i
3/
1 2 3\ 2 1 V
3.60. Find the inverse of each matrix: (i)  2 1 , (ii) 2 1
,42 5/ \5 2 3/
3.61. Find the inverse of [  1   3  4 ]
                          [  3  -1  6 ]
                          [ -1   5  1 ]
3.62. Show that the operations of inverse and transpose commute; that is, (A^t)^-1 = (A^-1)^t. Thus, in
particular, A is invertible if and only if A^t is invertible.
3.63. When is a diagonal matrix A = diag(a_1, a_2, ..., a_n) invertible, and what is its inverse?
3.64. Show that A is row equivalent to B if and only if there exists an invertible matrix P such that
B = PA.
3.65. Show that A is invertible if and only if the system AX = 0 has only the zero solution.
MISCELLANEOUS PROBLEMS
3.66. Prove Theorem 3.2: (iii) (B + C)A = BA + CA; (iv) k(AB) = (kA)B = A(kB), where k is a scalar.
(Parts (i) and (ii) were proven in Problems 3.11 and 3.12.)

3.67. Prove Theorem 3.3: (i) (A + B)^t = A^t + B^t; (ii) (A^t)^t = A; (iii) (kA)^t = kA^t, for k a scalar.
(Part (iv) was proven in Problem 3.16.)

3.68. Suppose A = (A_ik) and B = (B_kj) are block matrices for which AB is defined and the number of
columns of each block A_ik is equal to the number of rows of each block B_kj. Show that AB = (C_ij)
where C_ij = Σ_k A_ik B_kj.
3.69. The following operations are called elementary column operations:
    [E1]: Interchange the ith column and the jth column.
    [E2]: Multiply the ith column by a nonzero scalar k.
    [E3]: Replace the ith column by k times the jth column plus the ith column.
Show that each of the operations has an inverse operation of the same type.
3.70. A matrix A is said to be equivalent to a matrix B if B can be obtained from A by a finite sequence
of operations, each being an elementary row or column operation. Show that matrix equivalence is
an equivalence relation.
3.71. Show that two consistent systems of linear equations have the same solution set if and only if their
augmented matrices are row equivalent. (We assume that zero rows are added so that both
augmented matrices have the same number of rows.)
Answers to Supplementary Problems
3.39.
« il 1 1)
(ii)
Not defined. (iii) f
13
4
3 18 \
17 0/
3.40.
(i) Not defined.
<"« C)
<^' CI
/5 2 4
(") ( 11 3 12
«')
<"> m ":
8
1)
(vi) Not
1 0\ / 4 7 4\ / 4 2
3.41. (i) I 1 3 I (ii) Not defined. (iii) (9, 9) (iv) 068 (v) 14 (vi) 2 13
2 4/ \3 12 6/ \ 6 3 9y
3.42. (i) (a_1, a_2, a_3, a_4)   (ii) (b_1, b_2, b_3, b_4)   (iii) (c_1, c_2, c_3, c_4)
3.44. (i)  [ 1  2  -1   2  1 ]        [ 1  2  0  0   4/3 ]
           [ 0  0   3  -6  1 ]  and   [ 0  0  1  0    0  ]
           [ 0  0   0  -6  1 ]        [ 0  0  0  1  -1/6 ]

      (ii) [ 2    3  -2    5  1 ]        [ 1  0   4/11    5/11   13/11 ]
           [ 0  -11  10  -15  5 ]  and   [ 0  1  -10/11  15/11   -5/11 ]
           [ 0    0   0    0  0 ]        [ 0  0    0       0       0   ]
3.45. (i)  [ 1   3  -1  2 ]        [ 1  0   4/11  13/11 ]
           [ 0  11  -5  3 ]  and   [ 0  1  -5/11   3/11 ]
           [ 0   0   0  0 ]        [ 0  0    0      0   ]
           [ 0   0   0  0 ]        [ 0  0    0      0   ]

      (ii) [ 0  1    3  -2 ]        [ 0  1  0  0 ]
           [ 0  0  -13  11 ]  and   [ 0  0  1  0 ]
           [ 0  0    0  35 ]        [ 0  0  0  1 ]
           [ 0  0    0   0 ]        [ 0  0  0  0 ]
3.46. [ 0  0 ],  [ 0  1 ],  [ 1  k ] for any scalar k,  and  [ 1  0 ]
      [ 0  0 ]   [ 0  0 ]   [ 0  0 ]                         [ 0  1 ]
3.48. For example, [ 0  1  1 ]
                   [ 0  1  1 ]  is upper triangular but not an echelon matrix.
                   [ 0  0  1 ]
3.52. Only matrices of the form [ x  y ] commute with [ 1  1 ].
                                [ 0  x ]              [ 0  1 ]
3.53. A^n = [ 1  2n ]
            [ 0   1 ]
3.54.
/9
« ^+^ = (o 14
„ /14 ON
«  = c :)■  = c .;) '" '<^'  (T ;,;
2"
(»)^^=(o 33; ('^^ ^"=U 3«,
/3ci 3d,^
/ 5 2\ ,.., / 1/3 1/3
3.59. (1) (^ 3) (n) f ^/9 2/g
15 4 3\ / 8 1 3^
3.60. (i) 107 6
11
5 12
86 5/ \ 10 1 4y
3.61. [ 31/2  -17/2  -11 ]
      [  9/2   -5/2   -3 ]
      [   -7     4     5 ]
3.62. Given AA^-1 = I. Then I = I^t = (AA^-1)^t = (A^-1)^t A^t. That is, (A^t)^-1 = (A^-1)^t.
3.63. A is invertible if and only if each a_i ≠ 0. Then A^-1 = diag(a_1^-1, a_2^-1, ..., a_n^-1).
chapter 4
Vector Spaces and Subspaces
INTRODUCTION
In Chapter 1 we studied the concrete structures R^n and C^n and derived various proper-
ties. Now certain of these properties will play the role of axioms as we define abstract
"vector spaces" or, as they are sometimes called, "linear spaces". In particular, the conclu-
sions (i) through (viii) of Theorem 1.1, page 3, become axioms [A1]-[A4], [M1]-[M4] below.
We will see that, in a certain sense, we get nothing new. In fact, we prove in Chapter 5
that every vector space over R which has "finite dimension" (defined there) can be identified
with R^n for some n.
The definition of a vector space involves an arbitrary field (see Appendix B) whose
elements are called scalars. We adopt the following notation (unless otherwise stated or
implied):
K the field of scalars,
a, b, c or k   the elements of K,
V the given vector space,
u, V, w the elements of V.
We remark that nothing essential is lost if the reader assumes that K is the real field R
or the complex field C.
Lastly, we mention that the "dot product", and related notions such as orthogonality,
is not considered as part of the fundamental vector space structure, but as an additional
structure which may or may not be introduced. Such spaces shall be investigated in the
latter part of the text.
Definition: Let K be a given field and let V be a nonempty set with rules of addition and scalar multiplication which assign to any u, v ∈ V a sum u + v ∈ V and to any u ∈ V, k ∈ K a product ku ∈ V. Then V is called a vector space over K (and the elements of V are called vectors) if the following axioms hold:
[A1]: For any vectors u, v, w ∈ V, (u + v) + w = u + (v + w).
[A2]: There is a vector in V, denoted by 0 and called the zero vector, for which u + 0 = u for any vector u ∈ V.
[A3]: For each vector u ∈ V there is a vector in V, denoted by −u, for which u + (−u) = 0.
[A4]: For any vectors u, v ∈ V, u + v = v + u.
[M1]: For any scalar k ∈ K and any vectors u, v ∈ V, k(u + v) = ku + kv.
[M2]: For any scalars a, b ∈ K and any vector u ∈ V, (a + b)u = au + bu.
[M3]: For any scalars a, b ∈ K and any vector u ∈ V, (ab)u = a(bu).
[M4]: For the unit scalar 1 ∈ K, 1u = u for any vector u ∈ V.
64 VECTOR SPACES AND SUBSPACES [CHAP. 4
The above axioms naturally split into two sets. The first four are only concerned with the additive structure of V and can be summarized by saying that V is a commutative group (see Appendix B) under addition. It follows that any sum of vectors of the form
v1 + v2 + ... + vm
requires no parentheses and does not depend upon the order of the summands, the zero vector 0 is unique, the negative −u of u is unique, and the cancellation law holds:
u + w = v + w implies u = v
for any vectors u, v, w ∈ V. Also, subtraction is defined by
u − v = u + (−v)
On the other hand, the remaining four axioms are concerned with the "action" of the field K on V. Observe that the labelling of the axioms reflects this splitting. Using these additional axioms we prove (Problem 4.1) the following simple properties of a vector space.
Theorem 4.1: Let V be a vector space over a field K.
(i) For any scalar k ∈ K and 0 ∈ V, k0 = 0.
(ii) For 0 ∈ K and any vector u ∈ V, 0u = 0.
(iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0.
(iv) For any scalar k ∈ K and any vector u ∈ V, (−k)u = k(−u) = −ku.
EXAMPLES OF VECTOR SPACES
We now list a number of important examples of vector spaces. The first example is a generalization of the space R^n.
Example 4.1: Let K be an arbitrary field. The set of all n-tuples of elements of K with vector addition and scalar multiplication defined by
(a1, a2, ..., an) + (b1, b2, ..., bn) = (a1 + b1, a2 + b2, ..., an + bn)
and k(a1, a2, ..., an) = (ka1, ka2, ..., kan)
where ai, bi, k ∈ K, is a vector space over K; we denote this space by K^n. The zero vector in K^n is the n-tuple of zeros, 0 = (0, 0, ..., 0). The proof that K^n is a vector space is identical to the proof of Theorem 1.1, which we may now regard as stating that R^n with the operations defined there is a vector space over R.
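The componentwise operations of Example 4.1 can be sketched in code. This is a minimal illustration, taking K to be the rational field Q (modeled with Python's Fraction type); the helper names add_vec and scal_mul are our own, not from the text.

```python
from fractions import Fraction

def add_vec(u, v):
    """Componentwise sum (a1 + b1, ..., an + bn)."""
    return tuple(a + b for a, b in zip(u, v))

def scal_mul(k, u):
    """Scalar product (k*a1, ..., k*an)."""
    return tuple(k * a for a in u)

u = (Fraction(1), Fraction(2), Fraction(3))
v = (Fraction(4), Fraction(5), Fraction(6))
zero = (Fraction(0),) * 3

assert add_vec(u, v) == (5, 7, 9)            # vector addition
assert scal_mul(Fraction(2), u) == (2, 4, 6) # scalar multiplication
assert add_vec(u, zero) == u                 # axiom [A2]: u + 0 = u
```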
Example 4.2: Let V be the set of all m x n matrices with entries from an arbitrary field K. Then V is a vector space over K with respect to the operations of matrix addition and scalar multiplication, by Theorem 3.1.
Example 4.3: Let V be the set of all polynomials a0 + a1 t + a2 t² + ... + an t^n with coefficients ai from a field K. Then V is a vector space over K with respect to the usual operations of addition of polynomials and multiplication by a constant.
Example 4.4: Let K be an arbitrary field and let X be any nonempty set. Consider the set V of all functions from X into K. The sum of any two functions f, g ∈ V is the function f + g ∈ V defined by
(f + g)(x) = f(x) + g(x)
and the product of a scalar k ∈ K and a function f ∈ V is the function kf ∈ V defined by
(kf)(x) = kf(x)
Then V with the above operations is a vector space over K (Problem 4.5). The zero vector in V is the zero function 0 which maps each x ∈ X into 0 ∈ K: 0(x) = 0 for every x ∈ X. Furthermore, for any function f ∈ V, −f is that function in V for which (−f)(x) = −f(x), for every x ∈ X.
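The pointwise operations of Example 4.4 can be mirrored directly in code. A small sketch, with functions modeled as Python callables; the helper names fadd and fscale are our own.

```python
def fadd(f, g):
    """(f + g)(x) = f(x) + g(x), the pointwise sum."""
    return lambda x: f(x) + g(x)

def fscale(k, f):
    """(kf)(x) = k*f(x), the pointwise scalar product."""
    return lambda x: k * f(x)

f = lambda x: x + 1
g = lambda x: 2 * x
zero = lambda x: 0            # the zero vector of V

h = fadd(f, g)
assert [h(x) for x in (0, 1, 2)] == [1, 4, 7]
assert [fadd(f, zero)(x) for x in (0, 1, 2)] == [f(x) for x in (0, 1, 2)]
assert [fscale(3, f)(x) for x in (0, 1, 2)] == [3, 6, 9]
```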
Example 4.5: Suppose E is a field which contains a subfield K. Then E can be considered to be a vector space over K, taking the usual addition in E to be the vector addition and defining the scalar product kv of k ∈ K and v ∈ E to be the product of k and v as elements of the field E. Thus the complex field C is a vector space over the real field R, and the real field R is a vector space over the rational field Q.
SUBSPACES
Let W be a subset of a vector space V over a field K. W is called a subspace of V if W is itself a vector space over K with respect to the operations of vector addition and scalar multiplication on V. Simple criteria for identifying subspaces follow.
Theorem 4.2: W is a subspace of V if and only if
(i) W is nonempty,
(ii) W is closed under vector addition: v, w ∈ W implies v + w ∈ W,
(iii) W is closed under scalar multiplication: v ∈ W implies kv ∈ W for every k ∈ K.
Corollary 4.3: W is a subspace of V if and only if (i) 0 ∈ W (or W ≠ ∅), and (ii) v, w ∈ W implies av + bw ∈ W for every a, b ∈ K.
Example 4.6: Let V be any vector space. Then the set {0} consisting of the zero vector alone, and
also the entire space V are subspaces of V.
Example 4.7: (i) Let V be the vector space R³. Then the set W consisting of those vectors whose third component is zero, W = {(a, b, 0) : a, b ∈ R}, is a subspace of V.
(ii) Let V be the space of all square n x n matrices (see Example 4.2). Then the set W consisting of those matrices A = (a_ij) for which a_ij = a_ji, called symmetric matrices, is a subspace of V.
(iii) Let V be the space of polynomials (see Example 4.3). Then the set W consisting of polynomials with degree ≤ n, for a fixed n, is a subspace of V.
(iv) Let V be the space of all functions from a nonempty set X into the real field R. Then the set W consisting of all bounded functions in V is a subspace of V. (A function f ∈ V is bounded if there exists M ∈ R such that |f(x)| ≤ M for every x ∈ X.)
Example 4.8: Consider any homogeneous system of linear equations in n unknowns with, say, real coefficients:
a11 x1 + a12 x2 + ... + a1n xn = 0
a21 x1 + a22 x2 + ... + a2n xn = 0
..................................
am1 x1 + am2 x2 + ... + amn xn = 0
Recall that any particular solution of the system may be viewed as a point in R^n. The set W of all solutions of the homogeneous system is a subspace of R^n (Problem 4.16) called the solution space. We comment that the solution set of a nonhomogeneous system of linear equations in n unknowns is not a subspace of R^n.
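The closure property behind Example 4.8 is easy to check numerically. A small sketch using a single equation of our own choosing, x1 + 2·x2 − x3 = 0, not one from the text: any linear combination of two solutions is again a solution.

```python
def satisfies(x):
    """True if (x1, x2, x3) solves x1 + 2*x2 - x3 = 0."""
    x1, x2, x3 = x
    return x1 + 2 * x2 - x3 == 0

u = (1, 0, 1)      # one solution
v = (0, 1, 2)      # another solution
a, b = 3, -5
comb = tuple(a * ui + b * vi for ui, vi in zip(u, v))

assert satisfies(u) and satisfies(v)
assert satisfies(comb)     # closure: au + bv is again a solution
```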
Example 4.9: Let U and W be subspaces of a vector space V. We show that the intersection U ∩ W is also a subspace of V. Clearly 0 ∈ U and 0 ∈ W since U and W are subspaces; whence 0 ∈ U ∩ W. Now suppose u, v ∈ U ∩ W. Then u, v ∈ U and u, v ∈ W and, since U and W are subspaces,
au + bv ∈ U and au + bv ∈ W
for any scalars a, b ∈ K. Accordingly, au + bv ∈ U ∩ W and so U ∩ W is a subspace of V.
The result in the preceding example generalizes as follows.
Theorem 4.4: The intersection of any number of subspaces of a vector space V is a subspace of V.
LINEAR COMBINATIONS, LINEAR SPANS
Let V be a vector space over a field K and let v1, ..., vm ∈ V. Any vector in V of the form
a1 v1 + a2 v2 + ... + am vm
where the ai ∈ K, is called a linear combination of v1, ..., vm. The following theorem applies.
Theorem 4.5: Let S be a nonempty subset of V. The set of all linear combinations of vectors in S, denoted by L(S), is a subspace of V containing S. Furthermore, if W is any other subspace of V containing S, then L(S) ⊂ W.
In other words, L(S) is the smallest subspace of V containing S; hence it is called the subspace spanned or generated by S. For convenience, we define L(∅) = {0}.
Example 4.10: Let V be the vector space R³. The linear span of any nonzero vector u consists of all scalar multiples of u; geometrically, it is the line through the origin and the point u. The linear span of any two vectors u and v which are not multiples of each other is the plane through the origin and the points u and v.
Example 4.11: The vectors e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1) generate the vector space R³. For any vector (a, b, c) ∈ R³ is a linear combination of the ei; specifically,
(a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = ae1 + be2 + ce3
Example 4.12: The polynomials 1, t, t², t³, ... generate the vector space V of all polynomials (in t): V = L(1, t, t², ...). For any polynomial is a linear combination of 1 and powers of t.
Example 4.13: Determine whether or not the vector v = (3, 9, −4, −2) is a linear combination of the vectors u1 = (1, −2, 0, 3), u2 = (2, 3, 0, −1) and u3 = (2, −1, 2, 1), i.e. belongs to the space spanned by the ui.
Set v as a linear combination of the ui using unknowns x, y and z; that is, set v = xu1 + yu2 + zu3:
(3, 9, −4, −2) = x(1, −2, 0, 3) + y(2, 3, 0, −1) + z(2, −1, 2, 1)
= (x + 2y + 2z, −2x + 3y − z, 2z, 3x − y + z)
Form the equivalent system of equations by setting corresponding components equal to each other, and then reduce to echelon form:
x + 2y + 2z = 3          x + 2y + 2z = 3          x + 2y + 2z = 3
−2x + 3y − z = 9    or       7y + 3z = 15    or       7y + 3z = 15
         2z = −4                  2z = −4                  2z = −4
3x − y + z = −2             −7y − 5z = −11              −2z = 4
x + 2y + 2z = 3
or    7y + 3z = 15
           2z = −4
Note that the above system is consistent and so has a solution; hence v is a linear combination of the ui. Solving for the unknowns we obtain x = 1, y = 3, z = −2. Thus v = u1 + 3u2 − 2u3.
Note that if the system of linear equations were not consistent, i.e. had no solution, then the vector v would not be a linear combination of the ui.
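The conclusion of Example 4.13 can be spot-checked directly, without re-solving the system. A minimal sketch, with the minus signs of the vectors reconstructed from the arithmetic of the example:

```python
# Check that v = u1 + 3*u2 - 2*u3 componentwise.
u1 = (1, -2, 0, 3)
u2 = (2, 3, 0, -1)
u3 = (2, -1, 2, 1)
v  = (3, 9, -4, -2)

comb = tuple(a + 3 * b - 2 * c for a, b, c in zip(u1, u2, u3))
assert comb == v
```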
ROW SPACE OF A MATRIX
Let A be an arbitrary m x n matrix over a field K:
    ( a11  a12  ...  a1n )
A = ( a21  a22  ...  a2n )
    ( ...  ...  ...  ... )
    ( am1  am2  ...  amn )
The rows of A,
R1 = (a11, a12, ..., a1n), ..., Rm = (am1, am2, ..., amn)
viewed as vectors in K^n, span a subspace of K^n called the row space of A. That is,
row space of A = L(R1, R2, ..., Rm)
Analogously, the columns of A, viewed as vectors in K^m, span a subspace of K^m called the column space of A.
Now suppose we apply an elementary row operation on A,
(i) Ri ↔ Rj, (ii) Ri → kRi, k ≠ 0, or (iii) Ri → kRj + Ri
and obtain a matrix B. Then each row of B is clearly a row of A or a linear combination of rows of A. Hence the row space of B is contained in the row space of A. On the other hand, we can apply the inverse elementary row operation on B and obtain A; hence the row space of A is contained in the row space of B. Accordingly, A and B have the same row space. This leads us to the following theorem.
Theorem 4.6: Row equivalent matrices have the same row space.
We shall prove (Problem 4.31), in particular, the following fundamental result concerning row reduced echelon matrices.
Theorem 4.7: Row reduced echelon matrices have the same row space if and only if they
have the same nonzero rows.
Thus every matrix is row equivalent to a unique row reduced echelon matrix called its
row canonical form.
We apply the above results in the next example.
Example 4.14: Show that the space U generated by the vectors
u1 = (1, 2, −1, 3), u2 = (2, 4, 1, −2) and u3 = (3, 6, 3, −7)
and the space V generated by the vectors
v1 = (1, 2, −4, 11) and v2 = (2, 4, −5, 14)
are equal; that is, U = V.
Method 1. Show that each ui is a linear combination of v1 and v2, and show that each vi is a linear combination of u1, u2 and u3. Observe that we have to show that six systems of linear equations are consistent.
Method 2. Form the matrix A whose rows are the ui, and row reduce A to row canonical form:
    ( 1  2  −1   3 )      ( 1  2  −1   3 )      ( 1  2  0   1/3 )
A = ( 2  4   1  −2 )  to  ( 0  0   3  −8 )  to  ( 0  0  1  −8/3 )
    ( 3  6   3  −7 )      ( 0  0   6 −16 )      ( 0  0  0    0  )
Now form the matrix B whose rows are v1 and v2, and row reduce B to row canonical form:
B = ( 1  2  −4  11 )  to  ( 1  2  −4  11 )  to  ( 1  2  0   1/3 )
    ( 2  4  −5  14 )      ( 0  0   3  −8 )      ( 0  0  1  −8/3 )
Since the nonzero rows of the reduced matrices are identical, the row spaces of A and B are equal and so U = V.
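Method 2 of Example 4.14 can be sketched in code: row reduce both matrices over the rationals and compare the nonzero rows. The helper rref below is our own plain Gauss-Jordan elimination, not a routine from the text.

```python
from fractions import Fraction

def rref(rows):
    """Row canonical form; returns only the nonzero rows."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # find a row at or below pivot_row with a nonzero entry in this column
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        p = m[pivot_row][col]
        m[pivot_row] = [x / p for x in m[pivot_row]]   # make the pivot 1
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return [r for r in m if any(r)]

A = [[1, 2, -1, 3], [2, 4, 1, -2], [3, 6, 3, -7]]
B = [[1, 2, -4, 11], [2, 4, -5, 14]]

assert rref(A) == rref(B)   # same nonzero rows, hence the same row space
```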
SUMS AND DIRECT SUMS
Let U and W be subspaces of a vector space V. The sum of U and W, written U + W, consists of all sums u + w where u ∈ U and w ∈ W:
U + W = {u + w : u ∈ U, w ∈ W}
Note that 0 = 0 + 0 ∈ U + W, since 0 ∈ U, 0 ∈ W. Furthermore, suppose u + w and u' + w' belong to U + W, with u, u' ∈ U and w, w' ∈ W. Then
(u + w) + (u' + w') = (u + u') + (w + w') ∈ U + W
and, for any scalar k, k(u + w) = ku + kw ∈ U + W
Thus we have proven the following theorem.
Theorem 4.8: The sum U + W of the subspaces U and W of V is also a subspace of V.
Example 4.15: Let V be the vector space of 2 by 2 matrices over R. Let U consist of those matrices in V whose second row is zero, and let W consist of those matrices in V whose second column is zero:
U = {(a b; 0 0) : a, b ∈ R} and W = {(a 0; c 0) : a, c ∈ R}
Now U and W are subspaces of V. We have:
U + W = {(a b; c 0) : a, b, c ∈ R} and U ∩ W = {(a 0; 0 0) : a ∈ R}
That is, U + W consists of those matrices whose lower right entry is 0, and U ∩ W consists of those matrices whose second row and second column are zero.
Definition: The vector space V is said to be the direct sum of its subspaces U and W, denoted by
V = U ⊕ W
if every vector v ∈ V can be written in one and only one way as v = u + w where u ∈ U and w ∈ W.
The following theorem applies.
Theorem 4.9: The vector space V is the direct sum of its subspaces U and W if and only if: (i) V = U + W, and (ii) U ∩ W = {0}.
Example 4.16: In the vector space R³, let U be the xy plane and let W be the yz plane:
U = {(a, b, 0) : a, b ∈ R} and W = {(0, b, c) : b, c ∈ R}
Then R³ = U + W since every vector in R³ is the sum of a vector in U and a vector in W. However, R³ is not the direct sum of U and W since such sums are not unique; for example,
(3, 5, 7) = (3, 1, 0) + (0, 4, 7) and also (3, 5, 7) = (3, −4, 0) + (0, 9, 7)
Example 4.17: In R³, let U be the xy plane and let W be the z axis:
U = {(a, b, 0) : a, b ∈ R} and W = {(0, 0, c) : c ∈ R}
Now any vector (a, b, c) ∈ R³ can be written as the sum of a vector in U and a vector in W in one and only one way:
(a, b, c) = (a, b, 0) + (0, 0, c)
Accordingly, R³ is the direct sum of U and W, that is, R³ = U ⊕ W.
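The unique decomposition in Example 4.17 can be sketched as a small function: split a vector of R³ into its xy-plane part and its z-axis part.

```python
def decompose(v):
    """Split (a, b, c) as (a, b, 0) + (0, 0, c), the unique U + W decomposition."""
    a, b, c = v
    return (a, b, 0), (0, 0, c)

v = (3, 5, 7)
u, w = decompose(v)
assert u[2] == 0                   # u lies in the xy plane U
assert w[0] == 0 and w[1] == 0     # w lies on the z axis W
assert tuple(x + y for x, y in zip(u, w)) == v
```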
Solved Problems
VECTOR SPACES
4.1. Prove Theorem 4.1: Let V be a vector space over a field K.
(i) For any scalar k ∈ K and 0 ∈ V, k0 = 0.
(ii) For 0 ∈ K and any vector u ∈ V, 0u = 0.
(iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0.
(iv) For any k ∈ K and any u ∈ V, (−k)u = k(−u) = −ku.
(i) By axiom [A2] with u = 0, we have 0 + 0 = 0. Hence by axiom [M1], k0 = k(0 + 0) = k0 + k0. Adding −k0 to both sides gives the desired result.
(ii) By a property of K, 0 + 0 = 0. Hence by axiom [M2], 0u = (0 + 0)u = 0u + 0u. Adding −0u to both sides yields the required result.
(iii) Suppose ku = 0 and k ≠ 0. Then there exists a scalar k⁻¹ such that k⁻¹k = 1; hence
u = 1u = (k⁻¹k)u = k⁻¹(ku) = k⁻¹0 = 0
(iv) Using u + (−u) = 0, we obtain 0 = k0 = k(u + (−u)) = ku + k(−u). Adding −ku to both sides gives −ku = k(−u).
Using k + (−k) = 0, we obtain 0 = 0u = (k + (−k))u = ku + (−k)u. Adding −ku to both sides yields −ku = (−k)u. Thus (−k)u = k(−u) = −ku.
4.2. Show that for any scalar k and any vectors u and v, k(u − v) = ku − kv.
Using the definition of subtraction (u − v = u + (−v)) and the result of Theorem 4.1(iv) (k(−v) = −kv),
k(u − v) = k(u + (−v)) = ku + k(−v) = ku + (−kv) = ku − kv
4.3. In the statement of axiom [M2], (a + b)u = au + bu, which operation does each plus sign represent?
The + in (a + b)u denotes the addition of the two scalars a and b; hence it represents the addition operation in the field K. On the other hand, the + in au + bu denotes the addition of the two vectors au and bu; hence it represents the operation of vector addition. Thus each + represents a different operation.
4.4. In the statement of axiom [M3], (ab)u = a(bu), which operation does each product represent?
In (ab)u the product ab of the scalars a and b denotes multiplication in the field K, whereas the product of the scalar ab and the vector u denotes scalar multiplication.
In a(bu) the product bu of the scalar b and the vector u denotes scalar multiplication; also, the product of the scalar a and the vector bu denotes scalar multiplication.
4.5. Let V be the set of all functions from a nonempty set X into a field K. For any functions f, g ∈ V and any scalar k ∈ K, let f + g and kf be the functions in V defined as follows:
(f + g)(x) = f(x) + g(x) and (kf)(x) = kf(x), ∀x ∈ X
(The symbol ∀ means "for every".) Prove that V is a vector space over K.
Since X is nonempty, V is also nonempty. We now need to show that all the axioms of a vector space hold.
[A1]: Let f, g, h ∈ V. To show that (f + g) + h = f + (g + h), it is necessary to show that the function (f + g) + h and the function f + (g + h) both assign the same value to each x ∈ X. Now,
((f + g) + h)(x) = (f + g)(x) + h(x) = (f(x) + g(x)) + h(x), ∀x ∈ X
(f + (g + h))(x) = f(x) + (g + h)(x) = f(x) + (g(x) + h(x)), ∀x ∈ X
But f(x), g(x) and h(x) are scalars in the field K where addition of scalars is associative; hence
(f(x) + g(x)) + h(x) = f(x) + (g(x) + h(x))
Accordingly, (f + g) + h = f + (g + h).
[A2]: Let 0 denote the zero function: 0(x) = 0, ∀x ∈ X. Then for any function f ∈ V,
(f + 0)(x) = f(x) + 0(x) = f(x) + 0 = f(x), ∀x ∈ X
Thus f + 0 = f, and 0 is the zero vector in V.
[A3]: For any function f ∈ V, let −f be the function defined by (−f)(x) = −f(x). Then,
(f + (−f))(x) = f(x) + (−f)(x) = f(x) − f(x) = 0 = 0(x), ∀x ∈ X
Hence f + (−f) = 0.
[A4]: Let f, g ∈ V. Then
(f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x), ∀x ∈ X
Hence f + g = g + f. (Note that f(x) + g(x) = g(x) + f(x) follows from the fact that f(x) and g(x) are scalars in the field K where addition is commutative.)
[M1]: Let f, g ∈ V and k ∈ K. Then
(k(f + g))(x) = k((f + g)(x)) = k(f(x) + g(x)) = kf(x) + kg(x) = (kf)(x) + (kg)(x) = (kf + kg)(x), ∀x ∈ X
Hence k(f + g) = kf + kg. (Note that k(f(x) + g(x)) = kf(x) + kg(x) follows from the fact that k, f(x) and g(x) are scalars in the field K where multiplication is distributive over addition.)
[M2]: Let f ∈ V and a, b ∈ K. Then
((a + b)f)(x) = (a + b)f(x) = af(x) + bf(x) = (af)(x) + (bf)(x) = (af + bf)(x), ∀x ∈ X
Hence (a + b)f = af + bf.
[M3]: Let f ∈ V and a, b ∈ K. Then,
((ab)f)(x) = (ab)f(x) = a(bf(x)) = a((bf)(x)) = (a(bf))(x), ∀x ∈ X
Hence (ab)f = a(bf).
[M4]: Let f ∈ V. Then, for the unit 1 ∈ K, (1f)(x) = 1f(x) = f(x), ∀x ∈ X. Hence 1f = f.
Since all the axioms are satisfied, V is a vector space over K.
4.6. Let V be the set of ordered pairs of real numbers: V = {(a, b) : a, b ∈ R}. Show that V is not a vector space over R with respect to each of the following operations of addition in V and scalar multiplication on V:
(i) (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (ka, b);
(ii) (a, b) + (c, d) = (a, b) and k(a, b) = (ka, kb);
(iii) (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (k²a, k²b).
In each case show that one of the axioms of a vector space does not hold.
(i) Let r = 1, s = 2, v = (3, 4). Then
(r + s)v = 3(3, 4) = (9, 4)
rv + sv = 1(3, 4) + 2(3, 4) = (3, 4) + (6, 4) = (9, 8)
Since (r + s)v ≠ rv + sv, axiom [M2] does not hold.
(ii) Let v = (1, 2), w = (3, 4). Then
v + w = (1, 2) + (3, 4) = (1, 2)
w + v = (3, 4) + (1, 2) = (3, 4)
Since v + w ≠ w + v, axiom [A4] does not hold.
(iii) Let r = 1, s = 2, v = (3, 4). Then
(r + s)v = 3(3, 4) = (27, 36)
rv + sv = 1(3, 4) + 2(3, 4) = (3, 4) + (12, 16) = (15, 20)
Thus (r + s)v ≠ rv + sv, and so axiom [M2] does not hold.
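The counterexample in part (iii) of Problem 4.6 is purely computational, so it can be replayed in code. A small sketch of the defective scalar multiplication k(a, b) = (k²a, k²b):

```python
def smul(k, v):
    """The (defective) scalar multiplication k(a, b) = (k^2 a, k^2 b)."""
    return (k * k * v[0], k * k * v[1])

def vadd(u, v):
    """Ordinary componentwise addition."""
    return (u[0] + v[0], u[1] + v[1])

r, s, v = 1, 2, (3, 4)
lhs = smul(r + s, v)                 # (r + s)v
rhs = vadd(smul(r, v), smul(s, v))   # rv + sv

assert lhs == (27, 36)
assert rhs == (15, 20)
assert lhs != rhs                    # axiom [M2], (a + b)u = au + bu, fails
```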
SUBSPACES
4.7. Prove Theorem 4.2: W is a subspace of V if and only if (i) W is nonempty, (ii) v, w ∈ W implies v + w ∈ W, and (iii) v ∈ W implies kv ∈ W for every scalar k ∈ K.
Suppose W satisfies (i), (ii) and (iii). By (i), W is nonempty; and by (ii) and (iii), the operations of vector addition and scalar multiplication are well defined for W. Moreover, the axioms [A1], [A4], [M1], [M2], [M3] and [M4] hold in W since the vectors in W belong to V. Hence we need only show that [A2] and [A3] also hold in W. By (i), W is nonempty, say u ∈ W. Then by (iii), 0u = 0 ∈ W and v + 0 = v for every v ∈ W. Hence W satisfies [A2]. Lastly, if v ∈ W then (−1)v = −v ∈ W and v + (−v) = 0; hence W satisfies [A3]. Thus W is a subspace of V.
Conversely, if W is a subspace of V then clearly (i), (ii) and (iii) hold.
4.8. Prove Corollary 4.3: W is a subspace of V if and only if (i) 0 ∈ W and (ii) v, w ∈ W implies av + bw ∈ W for all scalars a, b ∈ K.
Suppose W satisfies (i) and (ii). Then, by (i), W is nonempty. Furthermore, if v, w ∈ W then, by (ii), v + w = 1v + 1w ∈ W; and if v ∈ W and k ∈ K then, by (ii), kv = kv + 0v ∈ W. Thus by Theorem 4.2, W is a subspace of V.
Conversely, if W is a subspace of V then clearly (i) and (ii) hold in W.
4.9. Let V = R³. Show that W is a subspace of V where:
(i) W = {(a, b, 0) : a, b ∈ R}, i.e. W is the xy plane consisting of those vectors whose third component is 0;
(ii) W = {(a, b, c) : a + b + c = 0}, i.e. W consists of those vectors each with the property that the sum of its components is zero.
(i) 0 = (0, 0, 0) ∈ W since the third component of 0 is 0. For any vectors v = (a, b, 0), w = (c, d, 0) in W, and any scalars (real numbers) k and k',
kv + k'w = k(a, b, 0) + k'(c, d, 0) = (ka, kb, 0) + (k'c, k'd, 0) = (ka + k'c, kb + k'd, 0)
Thus kv + k'w ∈ W, and so W is a subspace of V.
(ii) 0 = (0, 0, 0) ∈ W since 0 + 0 + 0 = 0. Suppose v = (a, b, c), w = (a', b', c') belong to W, i.e. a + b + c = 0 and a' + b' + c' = 0. Then for any scalars k and k',
kv + k'w = k(a, b, c) + k'(a', b', c') = (ka, kb, kc) + (k'a', k'b', k'c') = (ka + k'a', kb + k'b', kc + k'c')
and furthermore,
(ka + k'a') + (kb + k'b') + (kc + k'c') = k(a + b + c) + k'(a' + b' + c') = k0 + k'0 = 0
Thus kv + k'w ∈ W, and so W is a subspace of V.
4.10. Let V = R³. Show that W is not a subspace of V where:
(i) W = {(a, b, c) : a ≥ 0}, i.e. W consists of those vectors whose first component is nonnegative;
(ii) W = {(a, b, c) : a² + b² + c² ≤ 1}, i.e. W consists of those vectors whose length does not exceed 1;
(iii) W = {(a, b, c) : a, b, c ∈ Q}, i.e. W consists of those vectors whose components are rational numbers.
In each case, show that one of the properties of, say, Theorem 4.2 does not hold.
(i) v = (1, 2, 3) ∈ W and k = −5 ∈ R. But kv = −5(1, 2, 3) = (−5, −10, −15) does not belong to W since −5 is negative. Hence W is not a subspace of V.
(ii) v = (1, 0, 0) ∈ W and w = (0, 1, 0) ∈ W. But v + w = (1, 0, 0) + (0, 1, 0) = (1, 1, 0) does not belong to W since 1² + 1² + 0² = 2 > 1. Hence W is not a subspace of V.
(iii) v = (1, 2, 3) ∈ W and k = √2 ∈ R. But kv = √2(1, 2, 3) = (√2, 2√2, 3√2) does not belong to W since its components are not rational numbers. Hence W is not a subspace of V.
4.11. Let V be the vector space of all square n x n matrices over a field K. Show that W is a subspace of V where:
(i) W consists of the symmetric matrices, i.e. all matrices A = (a_ij) for which a_ij = a_ji;
(ii) W consists of all matrices which commute with a given matrix T; that is, W = {A ∈ V : AT = TA}.
(i) 0 ∈ W since all entries of 0 are 0 and hence equal. Now suppose A = (a_ij) and B = (b_ij) belong to W, i.e. a_ji = a_ij and b_ji = b_ij. For any scalars a, b ∈ K, aA + bB is the matrix whose ij-entry is aa_ij + bb_ij. But aa_ji + bb_ji = aa_ij + bb_ij. Thus aA + bB is also symmetric, and so W is a subspace of V.
(ii) 0 ∈ W since 0T = 0 = T0. Now suppose A, B ∈ W; that is, AT = TA and BT = TB. For any scalars a, b ∈ K,
(aA + bB)T = (aA)T + (bB)T = a(AT) + b(BT) = a(TA) + b(TB) = T(aA) + T(bB) = T(aA + bB)
Thus aA + bB commutes with T, i.e. belongs to W; hence W is a subspace of V.
4.12. Let V be the vector space of all 2 x 2 matrices over the real field R. Show that W is not a subspace of V where:
(i) W consists of all matrices with zero determinant;
(ii) W consists of all matrices A for which A² = A.
(i) (Recall that det (a b; c d) = ad − bc.) The matrices A = (1 0; 0 0) and B = (0 0; 0 1) belong to W since det(A) = 0 and det(B) = 0. But A + B = (1 0; 0 1) does not belong to W since det(A + B) = 1. Hence W is not a subspace of V.
(ii) The unit matrix I = (1 0; 0 1) belongs to W since I² = I. But 2I = (2 0; 0 2) does not belong to W since (2I)² = (4 0; 0 4) = 4I ≠ 2I. Hence W is not a subspace of V.
4.13. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace of V where:
(i) W = {f : f(3) = 0}, i.e. W consists of those functions which map 3 into 0;
(ii) W = {f : f(7) = f(1)}, i.e. W consists of those functions which assign the same value to 7 and 1;
(iii) W consists of the odd functions, i.e. those functions f for which f(−x) = −f(x).
Here 0 denotes the zero function: 0(x) = 0, for every x ∈ R.
(i) 0 ∈ W since 0(3) = 0. Suppose f, g ∈ W, i.e. f(3) = 0 and g(3) = 0. Then for any real numbers a and b,
(af + bg)(3) = af(3) + bg(3) = a0 + b0 = 0
Hence af + bg ∈ W, and so W is a subspace of V.
(ii) 0 ∈ W since 0(7) = 0 = 0(1). Suppose f, g ∈ W, i.e. f(7) = f(1) and g(7) = g(1). Then, for any real numbers a and b,
(af + bg)(7) = af(7) + bg(7) = af(1) + bg(1) = (af + bg)(1)
Hence af + bg ∈ W, and so W is a subspace of V.
(iii) 0 ∈ W since 0(−x) = 0 = −0 = −0(x). Suppose f, g ∈ W, i.e. f(−x) = −f(x) and g(−x) = −g(x). Then for any real numbers a and b,
(af + bg)(−x) = af(−x) + bg(−x) = −af(x) − bg(x) = −(af(x) + bg(x)) = −(af + bg)(x)
Hence af + bg ∈ W, and so W is a subspace of V.
4.14. Let V be the vector space of all functions from the real field R into R. Show that W is not a subspace of V where:
(i) W = {f : f(7) = 2 + f(1)};
(ii) W consists of all nonnegative functions, i.e. all functions f for which f(x) ≥ 0, ∀x ∈ R.
(i) Suppose f, g ∈ W, i.e. f(7) = 2 + f(1) and g(7) = 2 + g(1). Then
(f + g)(7) = f(7) + g(7) = 2 + f(1) + 2 + g(1) = 4 + f(1) + g(1) = 4 + (f + g)(1) ≠ 2 + (f + g)(1)
Hence f + g ∉ W, and so W is not a subspace of V.
(ii) Let k = −2 and let f ∈ V be defined by f(x) = x². Then f ∈ W since f(x) = x² ≥ 0, ∀x ∈ R. But (kf)(5) = kf(5) = (−2)(5²) = −50 < 0. Hence kf ∉ W, and so W is not a subspace of V.
4.15. Let V be the vector space of polynomials a0 + a1 t + a2 t² + ... + an t^n with real coefficients, i.e. ai ∈ R. Determine whether or not W is a subspace of V where:
(i) W consists of all polynomials with integral coefficients;
(ii) W consists of all polynomials with degree ≤ 3;
(iii) W consists of all polynomials b0 + b1 t² + b2 t⁴ + ... + bn t^(2n), i.e. polynomials with only even powers of t.
(i) No, since scalar multiples of vectors in W do not always belong to W. For example, v = 3 + 5t + 7t² ∈ W but ½v = 3/2 + (5/2)t + (7/2)t² ∉ W. (Observe that W is "closed" under vector addition, i.e. sums of elements in W belong to W.)
(ii) and (iii). Yes. For, in each case, W is nonempty, the sum of elements in W belongs to W, and the scalar multiples of any element in W belong to W.
4.16. Consider a homogeneous system of linear equations in n unknowns x1, ..., xn over a field K:
a11 x1 + a12 x2 + ... + a1n xn = 0
a21 x1 + a22 x2 + ... + a2n xn = 0
..................................
am1 x1 + am2 x2 + ... + amn xn = 0
Show that the solution set W is a subspace of the vector space K^n.
0 = (0, 0, ..., 0) ∈ W since, clearly,
ai1 0 + ai2 0 + ... + ain 0 = 0, for i = 1, ..., m
Suppose u = (u1, u2, ..., un) and v = (v1, v2, ..., vn) belong to W, i.e. for i = 1, ..., m,
ai1 u1 + ai2 u2 + ... + ain un = 0
ai1 v1 + ai2 v2 + ... + ain vn = 0
Let a and b be scalars in K. Then
au + bv = (au1 + bv1, au2 + bv2, ..., aun + bvn)
and, for i = 1, ..., m,
ai1(au1 + bv1) + ai2(au2 + bv2) + ... + ain(aun + bvn)
= a(ai1 u1 + ai2 u2 + ... + ain un) + b(ai1 v1 + ai2 v2 + ... + ain vn) = a0 + b0 = 0
Hence au + bv is a solution of the system, i.e. belongs to W. Accordingly, W is a subspace of K^n.
LINEAR COMBINATIONS
4.17. Write the vector v = (1, −2, 5) as a linear combination of the vectors e1 = (1, 1, 1), e2 = (1, 2, 3) and e3 = (2, −1, 1).
We wish to express v as v = xe1 + ye2 + ze3, with x, y and z as yet unknown scalars. Thus we require
(1, −2, 5) = x(1, 1, 1) + y(1, 2, 3) + z(2, −1, 1)
= (x, x, x) + (y, 2y, 3y) + (2z, −z, z)
= (x + y + 2z, x + 2y − z, x + 3y + z)
Form the equivalent system of equations by setting corresponding components equal to each other, and then reduce to echelon form:
x + y + 2z = 1          x + y + 2z = 1          x + y + 2z = 1
x + 2y − z = −2   or        y − 3z = −3   or        y − 3z = −3
x + 3y + z = 5             2y − z = 4                  5z = 10
Note that the above system is consistent and so has a solution. Solve for the unknowns to obtain x = −6, y = 3, z = 2. Hence v = −6e1 + 3e2 + 2e3.
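The answer to Problem 4.17 can be spot-checked by recombining the vectors:

```python
# Check that -6*e1 + 3*e2 + 2*e3 really gives v = (1, -2, 5).
e1, e2, e3 = (1, 1, 1), (1, 2, 3), (2, -1, 1)
v = tuple(-6 * a + 3 * b + 2 * c for a, b, c in zip(e1, e2, e3))
assert v == (1, -2, 5)
```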
4.18. Write the vector v = (2, −5, 3) in R³ as a linear combination of the vectors e1 = (1, −3, 2), e2 = (2, −4, −1) and e3 = (1, −5, 7).
Set v as a linear combination of the ei using the unknowns x, y and z: v = xe1 + ye2 + ze3.
(2, −5, 3) = x(1, −3, 2) + y(2, −4, −1) + z(1, −5, 7) = (x + 2y + z, −3x − 4y − 5z, 2x − y + 7z)
Form the equivalent system of equations and reduce to echelon form:
x + 2y + z = 2            x + 2y + z = 2          x + 2y + z = 2
−3x − 4y − 5z = −5  or       2y − 2z = 1    or       2y − 2z = 1
2x − y + 7z = 3             −5y + 5z = −1                  0 = 3
The system is inconsistent and so has no solution. Accordingly, v cannot be written as a linear combination of the vectors e1, e2 and e3.
4.19. For which value of k will the vector u = (1, 2, k) in R³ be a linear combination of the vectors v = (3, 0, −2) and w = (2, 1, −5)?
Set u = xv + yw:
(1, 2, k) = x(3, 0, −2) + y(2, 1, −5) = (3x + 2y, y, −2x − 5y)
Form the equivalent system of equations:
3x + 2y = 1, y = 2, −2x − 5y = k
By the first two equations, x = −1, y = 2. Substitute into the last equation to obtain k = −8.
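Problem 4.19 can be replayed numerically. A small sketch, with the signs of v and w reconstructed from the solution's arithmetic (v = (3, 0, -2), w = (2, 1, -5)):

```python
# Solve the first two equations, then read off k from the third.
y = 2
x = (1 - 2 * y) / 3          # from 3x + 2y = 1
k = -2 * x - 5 * y

assert x == -1
assert k == -8

# check: u = x*v + y*w really equals (1, 2, k)
v, w = (3, 0, -2), (2, 1, -5)
u = tuple(x * a + y * b for a, b in zip(v, w))
assert u == (1, 2, -8)
```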
4.20. Write the polynomial v = t² + 4t − 3 over R as a linear combination of the polynomials e1 = t² − 2t + 5, e2 = 2t² − 3t and e3 = t + 3.
Set v as a linear combination of the ei using the unknowns x, y and z: v = xe1 + ye2 + ze3.
t² + 4t − 3 = x(t² − 2t + 5) + y(2t² − 3t) + z(t + 3)
= xt² − 2xt + 5x + 2yt² − 3yt + zt + 3z
= (x + 2y)t² + (−2x − 3y + z)t + (5x + 3z)
Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form:
x + 2y = 1              x + 2y = 1              x + 2y = 1
−2x − 3y + z = 4   or       y + z = 6     or        y + z = 6
5x + 3z = −3             −10y + 3z = −8               13z = 52
Note that the system is consistent and so has a solution. Solve for the unknowns to obtain x = −3, y = 2, z = 4. Thus v = −3e1 + 2e2 + 4e3.
4.21. Write the matrix E = (3 1; 1 −1) as a linear combination of the matrices A = (1 1; 1 0), B = (0 0; 1 1) and C = (0 2; 0 −1).
Set E as a linear combination of A, B, C using the unknowns x, y, z: E = xA + yB + zC.
(3 1; 1 −1) = x(1 1; 1 0) + y(0 0; 1 1) + z(0 2; 0 −1)
= (x x; x 0) + (0 0; y y) + (0 2z; 0 −z)
= (x x + 2z; x + y y − z)
Form the equivalent system of equations by setting corresponding entries equal to each other:
x = 3, x + y = 1, x + 2z = 1, y − z = −1
Substitute x = 3 in the second and third equations to obtain y = −2 and z = −1. Since these values also satisfy the last equation, they form a solution of the system. Hence E = 3A − 2B − C.
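The matrix identity of Problem 4.21, E = 3A − 2B − C, can be verified entrywise:

```python
# Matrices as nested lists, rows first.
A = [[1, 1], [1, 0]]
B = [[0, 0], [1, 1]]
C = [[0, 2], [0, -1]]
E = [[3, 1], [1, -1]]

combo = [[3 * A[i][j] - 2 * B[i][j] - C[i][j] for j in range(2)] for i in range(2)]
assert combo == E
```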
4.22. Suppose u is a linear combination of the vectors v1, ..., vm and suppose each vi is a linear combination of the vectors w1, ..., wn:
u = a1 v1 + a2 v2 + ... + am vm and vi = b_i1 w1 + b_i2 w2 + ... + b_in wn
Show that u is also a linear combination of the wj. Thus if S ⊂ L(T), then L(S) ⊂ L(T).
u = a1 v1 + a2 v2 + ... + am vm
= a1(b_11 w1 + ... + b_1n wn) + a2(b_21 w1 + ... + b_2n wn) + ... + am(b_m1 w1 + ... + b_mn wn)
= (a1 b_11 + a2 b_21 + ... + am b_m1)w1 + ... + (a1 b_1n + a2 b_2n + ... + am b_mn)wn
or simply u = Σ_{i=1}^{m} a_i v_i = Σ_{i=1}^{m} a_i (Σ_{j=1}^{n} b_ij w_j) = Σ_{j=1}^{n} (Σ_{i=1}^{m} a_i b_ij) w_j
LINEAR SPANS, GENERATORS
4.23. Show that the vectors u = (1, 2, 3), v = (0, 1, 2) and w = (0, 0, 1) generate R³.
We need to show that an arbitrary vector (a, b, c) ∈ R³ is a linear combination of u, v and w. Set (a, b, c) = xu + yv + zw:
(a, b, c) = x(1, 2, 3) + y(0, 1, 2) + z(0, 0, 1) = (x, 2x + y, 3x + 2y + z)
Then form the system of equations
x = a
2x + y = b
3x + 2y + z = c
The above system is in echelon form and is consistent; in fact x = a, y = b − 2a, z = c − 2b + a is a solution. Thus u, v and w generate R³.
4.24. Find conditions on a, b and c so that (a, b, c) ∈ R³ belongs to the space generated by
u = (2, 1, 0), v = (1, -1, 2) and w = (0, 3, -4).
Set (a, b, c) as a linear combination of u, v and w using unknowns x, y and z: (a, b, c) =
xu + yv + zw.

    (a, b, c) = x(2, 1, 0) + y(1, -1, 2) + z(0, 3, -4) = (2x + y, x - y + 3z, 2y - 4z)
Form the equivalent system of linear equations and reduce it to echelon form:
    2x + y       = a         2x + y      = a              2x + y      = a
    x - y + 3z   = b   or        3y - 6z = a - 2b   or        3y - 6z = a - 2b
         2y - 4z = c             2y - 4z = c                        0 = 2a - 4b - 3c

The vector (a, b, c) belongs to the space generated by u, v and w if and only if the above system is
consistent, and it is consistent if and only if 2a - 4b - 3c = 0. Note, in particular, that u, v and
w do not generate the whole space R³.
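The consistency condition can be spot-checked numerically; a small sketch (the sample vectors below are illustrative choices, not from the text):

```python
u, v, w = (2, 1, 0), (1, -1, 2), (0, 3, -4)

def in_span(a, b, c):
    # membership condition read off from the echelon form above
    return 2 * a - 4 * b - 3 * c == 0

# (5, 1, 2) = 2u + v, so it lies in the span and satisfies the condition
sample = tuple(2 * ui + vi for ui, vi in zip(u, v))
assert sample == (5, 1, 2) and in_span(*sample)

# (1, 0, 0) violates the condition, so u, v, w miss part of R^3
assert not in_span(1, 0, 0)
```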
4.25. Show that the xy plane W = {(a, b, 0)} in R³ is generated by u and v where: (i) u =
(1, 2, 0) and v = (0, 1, 0); (ii) u = (2, -1, 0) and v = (1, 3, 0).
In each case show that an arbitrary vector (a, b, 0) ∈ W is a linear combination of u and v.
(i) Set (a, b, 0) = xu + yv:

    (a, b, 0) = x(1, 2, 0) + y(0, 1, 0) = (x, 2x + y, 0)

Then form the system of equations

    x      = a          x = a
    2x + y = b    or    y = b - 2a
    0      = 0

The system is consistent; in fact x = a, y = b - 2a is a solution. Hence u and v generate W.
(ii) Set (a, b, 0) = xu + yv:

    (a, b, 0) = x(2, -1, 0) + y(1, 3, 0) = (2x + y, -x + 3y, 0)

Form the following system and reduce it to echelon form:

    2x + y  = a         2x + y = a
    -x + 3y = b   or        7y = a + 2b
    0       = 0
The system is consistent and so has a solution. Hence W is generated by u and v. (Observe
that we do not need to solve for x and y; it is only necessary to know that a solution exists.)
4.26. Show that the vector space V of polynomials over any field K cannot be generated by
a finite number of vectors.
Any finite set S of polynomials contains one of maximum degree, say m. Then the linear span
L(S) of S cannot contain polynomials of degree greater than m. Accordingly, V ≠ L(S), for any
finite set S.
4.27. Prove Theorem 4.5: Let S be a nonempty subset of V. Then L(S), the set of all
linear combinations of vectors in S, is a subspace of V containing S. Furthermore, if
W is any other subspace of V containing S, then L(S) ⊆ W.

If v ∈ S, then 1v = v ∈ L(S); hence S is a subset of L(S). Also, L(S) is nonempty since S is
nonempty. Now suppose v, w ∈ L(S); say,

    v = a1v1 + ... + amvm    and    w = b1w1 + ... + bnwn

where vi, wj ∈ S and ai, bj are scalars. Then

    v + w = a1v1 + ... + amvm + b1w1 + ... + bnwn

and, for any scalar k,

    kv = k(a1v1 + ... + amvm) = ka1v1 + ... + kamvm

belong to L(S) since each is a linear combination of vectors in S. Accordingly, L(S) is a subspace
of V.

Now suppose W is a subspace of V containing S and suppose v1, ..., vm ∈ S ⊆ W. Then all
the multiples a1v1, ..., amvm ∈ W, where ai ∈ K, and hence the sum a1v1 + ... + amvm ∈ W. That
is, W contains all linear combinations of elements of S. Consequently, L(S) ⊆ W, as claimed.
ROW SPACE OF A MATRIX
4.28. Determine whether the following matrices have the same row space:

    A = [1  1   5]      B = [1  -1  -2]      C = [1  -1  -1]
        [2  3  13]          [3  -2  -3]          [4  -3  -1]
                                                 [3  -1   3]

Row reduce each matrix to row canonical form:

    A = [1  1   5]   to   [1  1  5]   to   [1  0  2]
        [2  3  13]        [0  1  3]        [0  1  3]

    B = [1  -1  -2]   to   [1  -1  -2]   to   [1  0  1]
        [3  -2  -3]        [0   1   3]        [0  1  3]

    C = [1  -1  -1]   to   [1  -1  -1]   to   [1  -1  -1]   to   [1  0  2]
        [4  -3  -1]        [0   1   3]        [0   1   3]        [0  1  3]
        [3  -1   3]        [0   2   6]        [0   0   0]        [0  0  0]
Since the nonzero rows of the reduced form of A and of the reduced form of C are the same,
A and C have the same row space. On the other hand, the nonzero rows of the reduced form of B
are not the same as the others, and so B has a different row space.
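The row-space comparison can be mechanized by computing row canonical forms with exact rational arithmetic. A sketch (the matrix entries are those of A, B and C in this problem):

```python
from fractions import Fraction

def rref_nonzero_rows(rows):
    """Row reduce a matrix to row canonical form; return its nonzero rows."""
    m = [[Fraction(x) for x in r] for r in rows]
    piv = 0
    for col in range(len(m[0])):
        # locate a pivot at or below row `piv` in this column
        hit = next((r for r in range(piv, len(m)) if m[r][col] != 0), None)
        if hit is None:
            continue
        m[piv], m[hit] = m[hit], m[piv]
        m[piv] = [x / m[piv][col] for x in m[piv]]      # normalize pivot to 1
        for r in range(len(m)):
            if r != piv and m[r][col] != 0:             # clear the column
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[piv])]
        piv += 1
        if piv == len(m):
            break
    return [r for r in m if any(x != 0 for x in r)]

A = [[1, 1, 5], [2, 3, 13]]
B = [[1, -1, -2], [3, -2, -3]]
C = [[1, -1, -1], [4, -3, -1], [3, -1, 3]]
assert rref_nonzero_rows(A) == rref_nonzero_rows(C)   # same row space
assert rref_nonzero_rows(A) != rref_nonzero_rows(B)   # different row space
```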
4.29. Consider an arbitrary matrix A = (a_{ij}). Suppose u = (b1, ..., bn) is a linear com-
bination of the rows R1, ..., Rm of A; say u = k1R1 + ... + kmRm. Show that, for each
i, bi = k1a_{1i} + k2a_{2i} + ... + kma_{mi}, where a_{1i}, ..., a_{mi} are the entries of the ith column
of A.

We are given u = k1R1 + ... + kmRm; hence

    (b1, ..., bn) = k1(a_{11}, ..., a_{1n}) + ... + km(a_{m1}, ..., a_{mn})
                  = (k1a_{11} + ... + kma_{m1}, ..., k1a_{1n} + ... + kma_{mn})

Setting corresponding components equal to each other, we obtain the desired result.
4.30. Prove: Let A = (a_{ij}) be an echelon matrix with distinguished entries a_{1j_1}, a_{2j_2}, ..., a_{rj_r},
and let B = (b_{ij}) be an echelon matrix with distinguished entries b_{1k_1}, b_{2k_2}, ..., b_{sk_s}:

    A = [a_{1j_1} * ... * ... *   ]      B = [b_{1k_1} * ... * ... *   ]
        [      a_{2j_2} * ... *   ]          [      b_{2k_2} * ... *   ]
        [            a_{rj_r} * * ]          [            b_{sk_s} * * ]

Suppose A and B have the same row space. Then the distinguished entries of A and
of B are in the same positions: j1 = k1, j2 = k2, ..., jr = kr, and r = s.

Clearly A = 0 if and only if B = 0, and so we need only prove the theorem when r ≥ 1
and s ≥ 1. We first show that j1 = k1. Suppose j1 < k1. Then the j1th column of B is zero.
Since the first row of A is in the row space of B, we have by the preceding problem, a_{1j_1} =
c1·0 + c2·0 + ... + cs·0 = 0 for scalars ci. But this contradicts the fact that the distinguished
entry a_{1j_1} ≠ 0. Hence j1 ≥ k1, and similarly k1 ≥ j1. Thus j1 = k1.

Now let A' be the submatrix of A obtained by deleting the first row of A, and let B' be the
submatrix of B obtained by deleting the first row of B. We prove that A' and B' have the same
row space. The theorem will then follow by induction since A' and B' are also echelon matrices.

Let R = (a1, a2, ..., an) be any row of A' and let R1, ..., Rm be the rows of B. Since R is in
the row space of B, there exist scalars d1, ..., dm such that R = d1R1 + d2R2 + ... + dmRm. Since
A is in echelon form and R is not the first row of A, the j1th entry of R is zero: ai = 0 for
i = j1 = k1. Furthermore, since B is in echelon form, all the entries in the k1th column of B are 0
except the first: b_{1k_1} ≠ 0, but b_{2k_1} = 0, ..., b_{mk_1} = 0. Thus

    0 = a_{k_1} = d1b_{1k_1} + d2·0 + ... + dm·0 = d1b_{1k_1}

Now b_{1k_1} ≠ 0 and so d1 = 0. Thus R is a linear combination of R2, ..., Rm and so is in the row
space of B'. Since R was any row of A', the row space of A' is contained in the row space of B'.
Similarly, the row space of B' is contained in the row space of A'. Thus A' and B' have the same
row space, and so the theorem is proved.
4.31. Prove Theorem 4.7: Let A = (a_{ij}) and B = (b_{ij}) be row reduced echelon matrices.
Then A and B have the same row space if and only if they have the same nonzero rows.

Obviously, if A and B have the same nonzero rows then they have the same row space. Thus
we only have to prove the converse.

Suppose A and B have the same row space, and suppose R ≠ 0 is the ith row of A. Then there
exist scalars c1, ..., cs such that

    R = c1R1 + c2R2 + ... + csRs     (1)

where the Ri are the nonzero rows of B. The theorem is proved if we show that R = Ri, i.e. that
ci = 1 but ck = 0 for k ≠ i.

Let a_{ij_i} be the distinguished entry in R, i.e. the first nonzero entry of R. By (1) and Problem 4.29,

    a_{ij_i} = c1b_{1j_i} + c2b_{2j_i} + ... + csb_{sj_i}     (2)

But by the preceding problem b_{ij_i} is a distinguished entry of B and, since B is row reduced, it is
the only nonzero entry in the j_i th column of B. Thus from (2) we obtain a_{ij_i} = cib_{ij_i}. However,
a_{ij_i} = 1 and b_{ij_i} = 1 since A and B are row reduced; hence ci = 1.

Now suppose k ≠ i, and b_{kj_k} is the distinguished entry in Rk. By (1) and Problem 4.29,

    a_{ij_k} = c1b_{1j_k} + c2b_{2j_k} + ... + csb_{sj_k}     (3)

Since B is row reduced, b_{kj_k} is the only nonzero entry in the j_k th column of B; hence by (3),
a_{ij_k} = ckb_{kj_k}. Furthermore, by the preceding problem a_{kj_k} is a distinguished entry of A and, since
A is row reduced, a_{ij_k} = 0. Thus ckb_{kj_k} = 0 and, since b_{kj_k} = 1, ck = 0. Accordingly R = Ri
and the theorem is proved.
4.32. Determine whether the following matrices have the same column space:

    A = [1  3  5]      B = [ 1   2   3]
        [1  4  3]          [-2  -3  -4]
        [1  1  9]          [ 7  12  17]

Observe that A and B have the same column space if and only if the transposes Aᵗ and Bᵗ have
the same row space. Thus reduce Aᵗ and Bᵗ to row reduced echelon form:

    Aᵗ = [1  1  1]   to   [1   1   1]   to   [1  1   1]   to   [1  0   3]
         [3  4  1]        [0   1  -2]        [0  1  -2]        [0  1  -2]
         [5  3  9]        [0  -2   4]        [0  0   0]        [0  0   0]

    Bᵗ = [1  -2   7]   to   [1  -2   7]   to   [1  -2   7]   to   [1  0   3]
         [2  -3  12]        [0   1  -2]        [0   1  -2]        [0  1  -2]
         [3  -4  17]        [0   2  -4]        [0   0   0]        [0  0   0]

Since Aᵗ and Bᵗ have the same row space, A and B have the same column space.
4.33. Let R be a row vector and B a matrix for which RB is defined. Show that RB is a
linear combination of the rows of B. Furthermore, if A is a matrix for which AB is
defined, show that the row space of AB is contained in the row space of B.

Suppose R = (a1, a2, ..., am) and B = (b_{ij}). Let B1, ..., Bm denote the rows of B and
B¹, ..., Bⁿ its columns. Then

    RB = (R·B¹, R·B², ..., R·Bⁿ)
       = (a1b_{11} + a2b_{21} + ... + amb_{m1}, a1b_{12} + a2b_{22} + ... + amb_{m2}, ..., a1b_{1n} + a2b_{2n} + ... + amb_{mn})
       = a1(b_{11}, b_{12}, ..., b_{1n}) + a2(b_{21}, b_{22}, ..., b_{2n}) + ... + am(b_{m1}, b_{m2}, ..., b_{mn})
       = a1B1 + a2B2 + ... + amBm

Thus RB is a linear combination of the rows of B, as claimed.

By Problem 3.27, the rows of AB are RiB where Ri is the ith row of A. Hence by the above
result each row of AB is in the row space of B. Thus the row space of AB is contained in the row
space of B.
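The first claim is easy to check on a small example; a sketch with illustrative numbers (not from the text):

```python
R = [2, -1, 3]
B = [[1, 0, 2],
     [0, 1, 1],
     [1, 1, 0]]

# RB computed entrywise as a matrix product
RB = [sum(R[i] * B[i][j] for i in range(len(R))) for j in range(len(B[0]))]

# the same vector built as the combination a1*B1 + a2*B2 + a3*B3 of the rows
combo = [0] * len(B[0])
for a_i, row in zip(R, B):
    combo = [c + a_i * x for c, x in zip(combo, row)]

assert RB == combo == [5, 2, 3]
```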
SUMS AND DIRECT SUMS
4.34. Let U and W be subspaces of a vector space V. Show that:
(i) U and W are contained in U + W;
(ii) U + W is the smallest subspace of V containing U and W, that is, U + W is the
linear span of U and W: U + W = L(U, W).
(i) Let u ∈ U. By hypothesis W is a subspace of V and so 0 ∈ W. Hence u = u + 0 ∈ U + W.
Accordingly, U is contained in U + W. Similarly, W is contained in U + W.
(ii) Since U + W is a subspace of V (Theorem 4.8) containing both U and W, it must also contain
the linear span of U and W: L(U, W) ⊆ U + W.
On the other hand, if v ∈ U + W then v = u + w = 1u + 1w where u ∈ U and w ∈ W;
hence v is a linear combination of elements in U ∪ W and so belongs to L(U, W). Thus
U + W ⊆ L(U, W).
The two inclusion relations give us the required result.
4.35. Suppose U and W are subspaces of a vector space V, and that {ui} generates U and
{wj} generates W. Show that {ui, wj}, i.e. {ui} ∪ {wj}, generates U + W.
Let v ∈ U + W. Then v = u + w where u ∈ U and w ∈ W. Since {ui} generates U, u is a
linear combination of ui's; and since {wj} generates W, w is a linear combination of wj's:

    u = a1u_{i_1} + a2u_{i_2} + ... + ar u_{i_r},   ak ∈ K
    w = b1w_{j_1} + b2w_{j_2} + ... + bs w_{j_s},   bk ∈ K

Thus    v = u + w = a1u_{i_1} + ... + ar u_{i_r} + b1w_{j_1} + ... + bs w_{j_s}
and so {ui, wj} generates U + W.
4.36. Prove Theorem 4.9: The vector space V is the direct sum of its subspaces U and W
if and only if (i) V = U + W and (ii) U∩W = {0}.
Suppose V = U ⊕ W. Then any v ∈ V can be uniquely written in the form v = u + w
where u ∈ U and w ∈ W. Thus, in particular, V = U + W. Now suppose v ∈ U∩W. Then:

    (1) v = v + 0 where v ∈ U, 0 ∈ W;  and  (2) v = 0 + v where 0 ∈ U, v ∈ W

Since such a sum for v must be unique, v = 0. Accordingly, U∩W = {0}.
On the other hand, suppose V = U + W and U∩W = {0}. Let v ∈ V. Since V = U + W,
there exist u ∈ U and w ∈ W such that v = u + w. We need to show that such a sum is unique.
Suppose also that v = u' + w' where u' ∈ U and w' ∈ W. Then

    u + w = u' + w'    and so    u - u' = w' - w

But u - u' ∈ U and w' - w ∈ W; hence by U∩W = {0},

    u - u' = 0,  w' - w = 0    and so    u = u',  w = w'

Thus such a sum for v ∈ V is unique and V = U ⊕ W.
4.37. Let U and W be the subspaces of R³ defined by

    U = {(a, b, c) : a = b = c}    and    W = {(0, b, c)}

(Note that W is the yz plane.) Show that R³ = U ⊕ W.
Note first that U∩W = {0}, for v = (a, b, c) ∈ U∩W implies that

    a = b = c  and  a = 0,  which implies  a = 0, b = 0, c = 0

i.e. v = (0, 0, 0).
We also claim that R³ = U + W. For if v = (a, b, c) ∈ R³, then v = (a, a, a) + (0, b - a, c - a)
where (a, a, a) ∈ U and (0, b - a, c - a) ∈ W. Both conditions, U∩W = {0} and R³ = U + W,
imply R³ = U ⊕ W.
4.38. Let V be the vector space of n-square matrices over the real field R. Let U and W be the
subspaces of symmetric and antisymmetric matrices, respectively. Show that
V = U ⊕ W. (The matrix M is symmetric iff M = Mᵗ, and antisymmetric iff
Mᵗ = -M.)
We first show that V = U + W. Let A be an arbitrary n-square matrix. Note that

    A = ½(A + Aᵗ) + ½(A - Aᵗ)

We claim that ½(A + Aᵗ) ∈ U and that ½(A - Aᵗ) ∈ W. For

    (½(A + Aᵗ))ᵗ = ½(A + Aᵗ)ᵗ = ½(Aᵗ + A) = ½(A + Aᵗ)

that is, ½(A + Aᵗ) is symmetric. Furthermore,

    (½(A - Aᵗ))ᵗ = ½(A - Aᵗ)ᵗ = ½(Aᵗ - A) = -½(A - Aᵗ)

that is, ½(A - Aᵗ) is antisymmetric.
We next show that U∩W = {0}. Suppose M ∈ U∩W. Then M = Mᵗ and Mᵗ = -M, which
implies M = -M or M = 0. Hence U∩W = {0}. Accordingly, V = U ⊕ W.
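The decomposition A = ½(A + Aᵗ) + ½(A - Aᵗ) can be verified on a sample matrix; a sketch using exact halves (the matrix entries are illustrative):

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1, 2], [5, 3]]          # an arbitrary 2x2 real matrix
h = Fraction(1, 2)
S = [[h * (A[i][j] + A[j][i]) for j in range(2)] for i in range(2)]  # (1/2)(A + A^t)
K = [[h * (A[i][j] - A[j][i]) for j in range(2)] for i in range(2)]  # (1/2)(A - A^t)

assert S == transpose(S)                                   # S is symmetric
assert K == [[-x for x in row] for row in transpose(K)]    # K is antisymmetric
assert [[S[i][j] + K[i][j] for j in range(2)] for i in range(2)] == A
```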
Supplementary Problems
VECTOR SPACES
4.39. Let V be the set of infinite sequences (a1, a2, ...) in a field K with addition in V and scalar multi-
plication on V defined by

    (a1, a2, ...) + (b1, b2, ...) = (a1 + b1, a2 + b2, ...)
    k(a1, a2, ...) = (ka1, ka2, ...)

where ai, bi, k ∈ K. Show that V is a vector space over K.
4.40. Let V be the set of ordered pairs (a, b) of real numbers with addition in V and scalar multiplication
on V defined by

    (a, b) + (c, d) = (a + c, b + d)    and    k(a, b) = (ka, 0)

Show that V satisfies all of the axioms of a vector space except [M4]: 1u = u. Hence [M4] is not a
consequence of the other axioms.
4.41. Let V be the set of ordered pairs (a, b) of real numbers. Show that V is not a vector space over R
with addition in V and scalar multiplication on V defined by:
(i) (a, b) + (c, d) = (a + d, b + c) and k(a, b) = (ka, kb);
(ii) (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (a, b);
(iii) (a, b) + (c, d) = (0, 0) and k(a, b) = (ka, kb);
(iv) (a, b) + (c, d) = (ac, bd) and k(a, b) = (ka, kb).
4.42. Let V be the set of ordered pairs (z1, z2) of complex numbers. Show that V is a vector space over the
real field R with addition in V and scalar multiplication on V defined by

    (z1, z2) + (w1, w2) = (z1 + w1, z2 + w2)    and    k(z1, z2) = (kz1, kz2)

where z1, z2, w1, w2 ∈ C and k ∈ R.
4.43. Let V be a vector space over K, and let F be a subfield of K. Show that V is also a vector space
over F where vector addition with respect to F is the same as that with respect to K, and where
scalar multiplication by an element k G F is the same as multiplication by k as an element of K.
4.44. Show that [A4], page 63, can be derived from the other axioms of a vector space.
4.45. Let U and W be vector spaces over a field K. Let V be the set of ordered pairs (u, w) where u
belongs to U and w to W: V = {(u, w) : u ∈ U, w ∈ W}. Show that V is a vector space over K
with addition in V and scalar multiplication on V defined by
(u, w) + (u', w') = (u + u',w + w') and k(u, w) = (ku, kw)
where u, u' G U, w,w' GW and k G K. (This space V is called the external direct sum of U
and W.)
SUBSPACES
4.46. Consider the vector space V in Problem 4.39, of infinite sequences (a1, a2, ...) in a field K. Show
that W is a subspace of V if:
(i) W consists of all sequences with 0 as the first component;
(ii) W consists of all sequences with only a finite number of nonzero components.
4.47. Determine whether or not W is a subspace of R³ if W consists of those vectors (a, b, c) ∈ R³ for
which: (i) a = 2b; (ii) a ≤ b ≤ c; (iii) ab = 0; (iv) a = b - c; (v) a = b²; (vi) k1a + k2b + k3c = 0,
where ki ∈ R.
4.48. Let V be the vector space of n-square matrices over a field K. Show that W is a subspace of V if
W consists of all matrices which are (i) antisymmetric (Aᵗ = -A), (ii) (upper) triangular,
(iii) diagonal, (iv) scalar.
4.49. Let AX = B be a nonhomogeneous system of linear equations in n unknowns over a field K.
Show that the solution set of the system is not a subspace of Kⁿ.
4.50. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace
of V in each of the following cases.
(i) W consists of all bounded functions. (Here f : R → R is bounded if there exists M ∈ R such
that |f(x)| ≤ M, ∀x ∈ R.)
(ii) W consists of all even functions. (Here f : R → R is even if f(-x) = f(x), ∀x ∈ R.)
(iii) W consists of all continuous functions.
(iv) W consists of all differentiable functions.
(v) W consists of all integrable functions in, say, the interval 0 ≤ x ≤ 1.
(The last three cases require some knowledge of analysis.)
4.51. Discuss whether or not R^ is a subspace of R^.
4.52. Prove Theorem 4.4: The intersection of any number of subspaces of a vector space V is a subspace
of V.
4.53. Suppose U and W are subspaces of V for which U ∪ W is also a subspace. Show that either
U ⊆ W or W ⊆ U.
LINEAR COMBINATIONS
4.54. Consider the vectors u = (1, -3, 2) and v = (2, -1, 1) in R³.
(i) Write (1, 7, -4) as a linear combination of u and v.
(ii) Write (2, -5, 4) as a linear combination of u and v.
(iii) For which value of k is (1, k, 5) a linear combination of u and v?
(iv) Find a condition on a, b and c so that (a, b, c) is a linear combination of u and v.
4.55. Write u as a linear combination of the polynomials v = 2t² + 3t - 4 and w = t² - 2t - 3 where:
(i) u = 3t² + 8t - 5; (ii) u = 4t² - 6t - 1.
4.56. Write B as a linear combination of ^ ((._■,) > ^ = (_j n) *"d *^ ~ ( )
where: (i) E = Q "^ ; (ii) E = (J _l^ .
LINEAR SPANS, GENERATORS
4.57. Show that (1, 1, 1), (0, 1, 1) and (0, 1, -1) generate R³, i.e. that any vector (a, b, c) is a linear com-
bination of the given vectors.
4.58. Show that the yz plane W = {(0, b, c)} in R³ is generated by: (i) (0, 1, 1) and (0, 2, 1); (ii) (0, 1, 2),
(0, 2, 3) and (0, 3, 1).
4.59. Show that the complex numbers w = 2 + 3i and z = 1 - 2i generate the complex field C as a
vector space over the real field R.
4.60. Show that the polynomials (1 - t)³, (1 - t)², 1 - t and 1 generate the space of polynomials of
degree ≤ 3.
4.61. Find one vector in R³ which generates the intersection of U and W where U is the xy plane:
U = {(a, b, 0)}, and W is the space generated by the vectors (1, 2, 3) and (1, -1, 1).
4.62. Prove: L(S) is the intersection of all the subspaces of V containing S.
4.63. Show that L(S) = L(S ∪ {0}). That is, by joining or deleting the zero vector from a set, we do not
change the space generated by the set.

4.64. Show that if S ⊆ T, then L(S) ⊆ L(T).

4.65. Show that L(L(S)) = L(S).
ROW SPACE OF A MATRIX
4.66. Determine which of the following matrices have the same row space:
/l 1 3\
.<: . :,, (.raO' '^ = 1:'°
(3 4 5)'
4.67. Let u1 = (1, 1, -1), u2 = (2, 3, -1), u3 = (3, 1, -5)
     v1 = (1, -1, -3), v2 = (3, -2, -8), v3 = (2, 1, -3)
Show that the subspace of R³ generated by the ui is the same as the subspace generated by the vi.
4.68. Show that if any row of an echelon (row reduced echelon) matrix is deleted, then the resulting
matrix is still in echelon (row reduced echelon) form.
4.69. Prove the converse of Theorem 4.6: Matrices with the same row space (and the same size) are
row equivalent.
4.70. Show that A and B have the same column space iff Aᵗ and Bᵗ have the same row space.
4.71. Let A and B be matrices for which AB is defined. Show that the column space of AB is contained
in the column space of A.
SUMS AND DIRECT SUMS
4.72. We extend the notion of sum to arbitrary nonempty subsets (not necessarily subspaces) S and T of
a vector space V by defining S + T = {s + t : s ∈ S, t ∈ T}. Show that this operation satisfies:
(i) commutative law: S+ T = T + S;
(ii) associative law: (Si + S2) + S3 = Si + (S2 + S3) ;
(iii) S + {0} = {0} + S = S;
(iv) S + V = V + S = V.
4.73. Show that for any subspace W of a vector space V, W + W = W.
4.74. Give an example of a subset S of a vector space V which is not a subspace of V but for which
(i) S + S = S, (ii) S + S ⊂ S (properly contained).
4.75. We extend the notion of sum of subspaces to more than two summands as follows. If W1, W2, ..., Wn
are subspaces of V, then

    W1 + W2 + ... + Wn = {w1 + w2 + ... + wn : wi ∈ Wi}

Show that:
(i) L(W1, W2, ..., Wn) = W1 + W2 + ... + Wn;
(ii) if Si generates Wi, i = 1, ..., n, then S1 ∪ S2 ∪ ... ∪ Sn generates W1 + W2 + ... + Wn.
4.76. Suppose U, V and W are subspaces of a vector space. Prove that

    (U∩V) + (U∩W) ⊆ U∩(V + W)

Find subspaces of R³ for which equality does not hold.
4.77. Let U, V and W be the following subspaces of R³:

    U = {(a, b, c) : a + b + c = 0},   V = {(a, b, c) : a = c},   W = {(0, 0, c) : c ∈ R}

Show that (i) R³ = U + V, (ii) R³ = U + W, (iii) R³ = V + W. When is the sum direct?
4.78. Let V be the vector space of all functions from the real field R into R. Let U be the subspace of
even functions and W the subspace of odd functions. Show that V = U ⊕ W. (Recall that f is
even iff f(-x) = f(x), and f is odd iff f(-x) = -f(x).)
4.79. Let W1, W2, ... be subspaces of a vector space V for which W1 ⊆ W2 ⊆ ⋯. Let W = W1 ∪ W2 ∪ ⋯.
Show that W is a subspace of V.

4.80. In the preceding problem, suppose Si generates Wi, i = 1, 2, .... Show that S = S1 ∪ S2 ∪ ⋯
generates W.
4.81. Let V be the vector space of n-square matrices over a field K. Let U be the subspace of upper
triangular matrices and W the subspace of lower triangular matrices. Find (i) U + W, (ii) U∩W.
4.82. Let V be the external direct sum of the vector spaces U and W over a field K. (See Problem 4.45.)
Let

    Û = {(u, 0) : u ∈ U},   Ŵ = {(0, w) : w ∈ W}

Show that (i) Û and Ŵ are subspaces of V, (ii) V = Û ⊕ Ŵ.
Answers to Supplementary Problems
4.47. (i) Yes. (iv) Yes.
(ii) No; e.g. (1, 2, 3) ∈ W but -2(1, 2, 3) ∉ W. (v) No; e.g. (9, 3, 0) ∈ W but 2(9, 3, 0) ∉ W.
(iii) No; e.g. (1, 0, 0), (0, 1, 0) ∈ W, but not their sum. (vi) Yes.
4.50. (i) Let f, g ∈ W with Mf and Mg bounds for f and g respectively. Then for any scalars a, b ∈ R,

    |(af + bg)(x)| = |af(x) + bg(x)| ≤ |af(x)| + |bg(x)| = |a||f(x)| + |b||g(x)| ≤ |a|Mf + |b|Mg

That is, |a|Mf + |b|Mg is a bound for the function af + bg.
(ii) (af + bg)(-x) = af(-x) + bg(-x) = af(x) + bg(x) = (af + bg)(x)
4.51. No. Although one may "identify" the vector (a, b) ∈ R² with, say, (a, b, 0) in the xy plane in R³,
they are distinct elements belonging to distinct, disjoint sets.
4.54. (i) -3u + 2v. (ii) Impossible. (iii) k = -8. (iv) a - 3b - 5c = 0.

4.55. (i) u = 2v - w. (ii) Impossible.

4.56. (i) E = 2A - B + 2C. (ii) Impossible.
4.61. (2, -5, 0).
4.66. A and C.
4.67. Form the matrix A whose rows are the ui and the matrix B whose rows are the vi, and then show
that A and B have the same row canonical forms.
4.74. (i) In R², let S = {(0, 0), (0, 1), (0, 2), (0, 3), ...}.
(ii) In R², let S = {(0, 5), (0, 6), (0, 7), ...}.
4.77. The sum is direct in (ii) and (iii).
4.78. Hint. f(x) = ½(f(x) + f(-x)) + ½(f(x) - f(-x)), where ½(f(x) + f(-x)) is even and ½(f(x) - f(-x))
is odd.
4.81. (i) V = U + W. (ii) U∩W is the space of diagonal matrices.
chapter 5
Basis and Dimension
INTRODUCTION
Some of the fundamental results proven in this chapter are:
(i) The "dimension" of a vector space is well defined (Theorem 5.3).
(ii) If V has dimension n over K, then V is "isomorphic" to Kⁿ (Theorem 5.12).
(iii) A system of linear equations has a solution if and only if the coefficient and
augmented matrices have the same "rank" (Theorem 5.10).
These concepts and results are nontrivial and answer certain questions raised and investigated
by mathematicians of yesterday.
We will begin the chapter with the definition of linear dependence and independence.
This concept plays an essential role in the theory of linear algebra and in mathematics in
general.
LINEAR DEPENDENCE
Definition: Let V be a vector space over a field K. The vectors v1, ..., vm ∈ V are said
to be linearly dependent over K, or simply dependent, if there exist scalars
a1, ..., am ∈ K, not all of them 0, such that

    a1v1 + a2v2 + ... + amvm = 0     (*)

Otherwise, the vectors are said to be linearly independent over K, or simply
independent.
Observe that the relation (*) will always hold if the a's are all 0. If this relation holds
only in this case, that is,

    a1v1 + a2v2 + ... + amvm = 0    only if    a1 = 0, ..., am = 0

then the vectors are linearly independent. On the other hand, if the relation (*) also holds
when one of the a's is not 0, then the vectors are linearly dependent.
Observe that if 0 is one of the vectors v1, ..., vm, say v1 = 0, then the vectors must be
dependent; for

    1v1 + 0v2 + ... + 0vm = 1·0 + 0 + ... + 0 = 0

and the coefficient of v1 is not 0. On the other hand, any nonzero vector v is, by itself,
independent; for

    kv = 0, v ≠ 0    implies    k = 0
Other examples of dependent and independent vectors follow.
Example 5.1: The vectors u = (1, -1, 0), v = (1, 3, -1) and w = (5, 3, -2) are dependent since
3u + 2v - w = 0:

    3(1, -1, 0) + 2(1, 3, -1) - (5, 3, -2) = (0, 0, 0)
Example 5.2: We show that the vectors u = (6, 2, 3, 4), v = (0, 5, -3, 1) and w = (0, 0, 7, -2)
are independent. For suppose xu + yv + zw = 0 where x, y and z are unknown
scalars. Then

    (0, 0, 0, 0) = x(6, 2, 3, 4) + y(0, 5, -3, 1) + z(0, 0, 7, -2)
                 = (6x, 2x + 5y, 3x - 3y + 7z, 4x + y - 2z)

and so, by the equality of the corresponding components,

    6x           = 0
    2x + 5y      = 0
    3x - 3y + 7z = 0
    4x +  y - 2z = 0

The first equation yields x = 0; the second equation with x = 0 yields y = 0; and
the third equation with x = 0, y = 0 yields z = 0. Thus

    xu + yv + zw = 0    implies    x = 0, y = 0, z = 0

Accordingly u, v and w are independent.
Observe that the vectors in the preceding example form a matrix in echelon form:

    [6  2   3   4]
    [0  5  -3   1]
    [0  0   7  -2]
Thus we have shown that the (nonzero) rows of the above echelon matrix are independent.
This result holds true in general; we state it formally as a theorem since it will be frequently
used.
Theorem 5.1: The nonzero rows of a matrix in echelon form are linearly independent.
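For the echelon matrix above, the proof of independence is just the forward substitution of Example 5.2; a small sketch in exact arithmetic:

```python
from fractions import Fraction

u = [6, 2, 3, 4]
v = [0, 5, -3, 1]
w = [0, 0, 7, -2]

# Solve x*u + y*v + z*w = 0 component by component, as in Example 5.2.
x = Fraction(0)                  # from 6x = 0
y = (-2 * x) / 5                 # from 2x + 5y = 0
z = -(3 * x - 3 * y) / 7         # from 3x - 3y + 7z = 0
assert (x, y, z) == (0, 0, 0)

# the remaining equation 4x + y - 2z = 0 is then automatic
assert 4 * x + y - 2 * z == 0
```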
For more than one vector, the concept of dependence can be defined equivalently as
follows:
The vectors v1, ..., vm are linearly dependent if and only if one of them is a linear
combination of the others.
For suppose, say, vi is a linear combination of the others:

    vi = a1v1 + ... + a_{i-1}v_{i-1} + a_{i+1}v_{i+1} + ... + amvm

Then by adding -vi to both sides, we obtain

    a1v1 + ... + a_{i-1}v_{i-1} - vi + a_{i+1}v_{i+1} + ... + amvm = 0

where the coefficient of vi is not 0; hence the vectors are linearly dependent. Conversely,
suppose the vectors are linearly dependent, say,

    b1v1 + ... + bjvj + ... + bmvm = 0    where bj ≠ 0

Then

    vj = -bj⁻¹b1v1 - ... - bj⁻¹b_{j-1}v_{j-1} - bj⁻¹b_{j+1}v_{j+1} - ... - bj⁻¹bmvm

and so vj is a linear combination of the other vectors.
We now make a slightly stronger statement than that above; this result has many im
portant consequences.
Lemma 5.2: The nonzero vectors v1, ..., vm are linearly dependent if and only if one of
them, say vi, is a linear combination of the preceding vectors:

    vi = k1v1 + k2v2 + ... + k_{i-1}v_{i-1}
Remark 1. The set {v1, ..., vm} is called a dependent or independent set according as the
vectors v1, ..., vm are dependent or independent. We also define the empty
set to be independent.
Remark 2. If two of the vectors v1, ..., vm are equal, say v1 = v2, then the vectors are
dependent, for

    v1 - v2 + 0v3 + ... + 0vm = 0

and the coefficient of v1 is not 0.
Remark 3. Two vectors v1 and v2 are dependent if and only if one of them is a multiple of
the other.
Remark 4. A set which contains a dependent subset is itself dependent. Hence any
subset of an independent set is independent.
Remark 5. If the set {v1, ..., vm} is independent, then any rearrangement of the vectors
{v_{i_1}, v_{i_2}, ..., v_{i_m}} is also independent.
Remark 6. In the real space R³ dependence of vectors can be described geometrically as
follows: any two vectors u and v are dependent if and only if they lie on the
same line through the origin; and any three vectors u, v and w are dependent
if and only if they lie on the same plane through the origin:
u and v are dependent.
u, v and w are dependent.
BASIS AND DIMENSION
We begin with a definition.
Definition: A vector space V is said to be of finite dimension n or to be n-dimensional,
written dim V = n, if there exist linearly independent vectors e1, e2, ..., en
which span V. The sequence {e1, e2, ..., en} is then called a basis of V.
The above definition of dimension is well defined in view of the following theorem.
Theorem 5.3: Let V be a finite dimensional vector space. Then every basis of V has the
same number of elements.
The vector space {0} is defined to have dimension 0. (In a certain sense this agrees with
the above definition since, by definition, the empty set is independent and generates {0}.) When a
vector space is not of finite dimension, it is said to be of infinite dimension.
Example 5.3: Let K be any field. Consider the vector space Kⁿ which consists of n-tuples of ele-
ments of K. The vectors

    e1 = (1, 0, 0, ..., 0, 0)
    e2 = (0, 1, 0, ..., 0, 0)
    .........................
    en = (0, 0, 0, ..., 0, 1)

form a basis, called the usual basis, of Kⁿ. Thus Kⁿ has dimension n.
Example 5.4: Let U be the vector space of all 2 × 3 matrices over a field K. Then the matrices

    [1 0 0]  [0 1 0]  [0 0 1]  [0 0 0]  [0 0 0]  [0 0 0]
    [0 0 0]  [0 0 0]  [0 0 0]  [1 0 0]  [0 1 0]  [0 0 1]

form a basis of U. Thus dim U = 6. More generally, let V be the vector space
of all m × n matrices over K and let E_{ij} ∈ V be the matrix with ij-entry 1 and 0
elsewhere. Then the set {E_{ij}} is a basis, called the usual basis, of V (Problem 5.32);
consequently dim V = mn.

Example 5.5: Let W be the vector space of polynomials (in t) of degree ≤ n. The set {1, t, t², ..., tⁿ}
is linearly independent and generates W. Thus it is a basis of W and so
dim W = n + 1.
We comment that the vector space V of all polynomials is not finite dimensional
since (Problem 4.26) no finite set of polynomials generates V.
The above fundamental theorem on dimension is a consequence of the following im
portant "replacement lemma":
Lemma 5.4: Suppose the set {v1, v2, ..., vn} generates a vector space V. If {w1, ..., wm}
is linearly independent, then m ≤ n and V is generated by a set of the form

    {w1, ..., wm, v_{i_1}, ..., v_{i_{n-m}}}

Thus, in particular, any n + 1 or more vectors in V are linearly dependent.
Observe in the above lemma that we have replaced m of the vectors in the generating
set by the m independent vectors and still retained a generating set.
Now suppose S is a subset of a vector space V. We call {v1, ..., vm} a maximal in-
dependent subset of S if:
(i) it is an independent subset of S; and
(ii) {v1, ..., vm, w} is dependent for any w ∈ S.
The following theorem applies.
Theorem 5.5: Suppose S generates V and {v1, ..., vm} is a maximal independent subset
of S. Then {v1, ..., vm} is a basis of V.
The main relationship between the dimension of a vector space and its independent
subsets is contained in the next theorem.
Theorem 5.6: Let V be of finite dimension n. Then:
(i) Any set of n + 1 or more vectors is linearly dependent.
(ii) Any linearly independent set is part of a basis, i.e. can be extended to
a basis.
(iii) A linearly independent set with n elements is a basis.
Example 5.6: The four vectors in K⁴

    (1, 1, 1, 1), (0, 1, 1, 1), (0, 0, 1, 1), (0, 0, 0, 1)

are linearly independent since they form a matrix in echelon form. Furthermore,
since dim K⁴ = 4, they form a basis of K⁴.
Example 5.7: The four vectors in R³,
(257,132,58), (43,0,17), (521,317,94), (328,512,731)
must be linearly dependent since they come from a vector space of dimension 3.
DIMENSION AND SUBSPACES
The following theorems give basic relationships between the dimension of a vector space
and the dimension of a subspace.
Theorem 5.7: Let W be a subspace of an n-dimensional vector space V. Then dim W ≤ n.
In particular, if dim W = n, then W = V.
Example 5.8: Let W be a subspace of the real space R³. Now dim R³ = 3; hence by the preced-
ing theorem the dimension of W can only be 0, 1, 2 or 3. The following cases apply:
(i) dim W = 0, then W = {0}, a point;
(ii) dim W = 1, then W is a line through the origin;
(iii) dim W = 2, then W is a plane through the origin;
(iv) dim W = 3, then W is the entire space R³.
Theorem 5.8: Let U and W be finitedimensional subspaces of a vector space V. Then
U + W has finite dimension and
    dim (U + W) = dim U + dim W - dim (U∩W)
Note that if V is the direct sum of U and W, i.e. V = U ® W, then dim V =
dim U + dim W (Problem 5.48).
Example 5.9: Suppose U and W are the xy plane and yz plane, respectively, in R³: U = {(a, b, 0)},
W = {(0, b, c)}. Since R³ = U + W, dim (U + W) = 3. Also, dim U = 2 and
dim W = 2. By the above theorem,

    3 = 2 + 2 - dim (U∩W)    or    dim (U∩W) = 1

Observe that this agrees with the fact that U∩W is the y axis, i.e. U∩W =
{(0, b, 0)}, and so has dimension 1.
[Figure: the planes U and W in R^3, intersecting in the y axis U ∩ W]
RANK OF A MATRIX
Let A be an arbitrary m × n matrix over a field K. Recall that the row space of A is the subspace of K^n generated by its rows, and the column space of A is the subspace of K^m generated by its columns. The dimensions of the row space and of the column space of A are called, respectively, the row rank and the column rank of A.
Theorem 5.9: The row rank and the column rank of the matrix A are equal.
Definition: The rank of the matrix A, written rank (A), is the common value of its row
rank and column rank.
Thus the rank of a matrix gives the maximum number of independent rows, and also
the maximum number of independent columns. We can obtain the rank of a matrix as
follows.
Suppose

    A  =  ( 1   2   0  -1 )
          ( 2   6  -3  -3 )
          ( 3  10  -6  -5 )

We reduce A to echelon form using the elementary row operations:

    A   to   ( 1  2   0  -1 )    to   ( 1  2   0  -1 )
             ( 0  2  -3  -1 )         ( 0  2  -3  -1 )
             ( 0  4  -6  -2 )         ( 0  0   0   0 )

Recall that row equivalent matrices have the same row space. Thus the nonzero rows of the echelon matrix, which are independent by Theorem 5.1, form a basis of the row space of A. Hence the rank of A is 2.
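The row-reduction procedure above is easy to carry out mechanically. The following Python sketch (not part of the original text; the helper name `row_echelon_rank` is ours) reduces a matrix to echelon form with exact rational arithmetic and counts the nonzero rows:

```python
from fractions import Fraction

def row_echelon_rank(rows):
    """Reduce a list of rows to echelon form by elementary row
    operations and return the number of nonzero rows (the rank)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rank, col, n_rows, n_cols = 0, 0, len(m), len(m[0])
    while rank < n_rows and col < n_cols:
        # find a pivot row for this column
        pivot = next((i for i in range(rank, n_rows) if m[i][col]), None)
        if pivot is None:
            col += 1
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, n_rows):   # clear entries below the pivot
            f = m[i][col] / m[rank][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[rank])]
        rank += 1
        col += 1
    return rank

# The matrix A of the example above:
A = [[1, 2, 0, -1], [2, 6, -3, -3], [3, 10, -6, -5]]
print(row_echelon_rank(A))   # 2
```

Exact Fraction arithmetic keeps round-off out of the pivoting decisions, which floating point could not guarantee.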
APPLICATIONS TO LINEAR EQUATIONS
Consider a system of m linear equations in n unknowns x1, ..., xn over a field K:

    a11 x1 + a12 x2 + ... + a1n xn = b1
    a21 x1 + a22 x2 + ... + a2n xn = b2
    ...................................
    am1 x1 + am2 x2 + ... + amn xn = bm

or the equivalent matrix equation
    AX = B
where A = (aij) is the coefficient matrix, and X = (xi) and B = (bi) are the column vectors consisting of the unknowns and of the constants, respectively. Recall that the augmented matrix of the system is defined to be the matrix

    (A, B)  =  ( a11  a12  ...  a1n  b1 )
               ( a21  a22  ...  a2n  b2 )
               ( ...................... )
               ( am1  am2  ...  amn  bm )
Remark 1. The above linear equations are said to be dependent or independent according
as the corresponding vectors, i.e. the rows of the augmented matrix, are
dependent or independent.
Remark 2. Two systems of linear equations are equivalent if and only if the corresponding
augmented matrices are row equivalent, i.e. have the same row space.
Remark 3. We can always replace a system of equations by a system of independent
equations, such as a system in echelon form. The number of independent
equations will always be equal to the rank of the augmented matrix.
Observe that the above system is also equivalent to the vector equation
    x1 (a11, a21, ..., am1) + x2 (a12, a22, ..., am2) + ... + xn (a1n, a2n, ..., amn) = (b1, b2, ..., bm)
Thus the system AX = B has a solution if and only if the column vector B is a linear combination of the columns of the matrix A, i.e. belongs to the column space of A. This gives us the following basic existence theorem.
Theorem 5.10: The system of linear equations AX = B has a solution if and only if the coefficient matrix A and the augmented matrix (A, B) have the same rank.
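Theorem 5.10 yields a mechanical consistency test: row reduce both A and (A, B) and compare the number of nonzero rows. A small Python sketch (the helper names are ours; exact arithmetic via Fraction):

```python
from fractions import Fraction

def rank(rows):
    """Rank = number of nonzero rows after Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_consistent(A, B):
    """AX = B has a solution iff rank(A) = rank(A, B), by Theorem 5.10."""
    aug = [row + [b] for row, b in zip(A, B)]
    return rank(A) == rank(aug)

# x + y = 1, 2x + 2y = 2 is consistent; changing the second constant to 3 is not.
print(is_consistent([[1, 1], [2, 2]], [1, 2]))   # True
print(is_consistent([[1, 1], [2, 2]], [1, 3]))   # False
```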
Recall (Theorem 2.1) that if the system AX = B does have a solution, say v, then its general solution is of the form v + W = {v + w : w ∈ W} where W is the general solution of the associated homogeneous system AX = 0. Now W is a subspace of K^n and so has a dimension. The next theorem, whose proof is postponed until the next chapter (page 127), applies.
Theorem 5.11: The dimension of the solution space W of the homogeneous system of linear equations AX = 0 is n - r where n is the number of unknowns and r is the rank of the coefficient matrix A.
In case the system AX = 0 is in echelon form, then it has precisely n - r free variables (see page 21), say x_{i1}, x_{i2}, ..., x_{i_{n-r}}. Let vj be the solution obtained by setting x_{ij} = 1 and all other free variables = 0. Then the solutions v1, ..., v_{n-r} are linearly independent (Problem 5.43) and so form a basis for the solution space.
Example 5.10: Find the dimension and a basis of the solution space W of the system of linear equations
    x + 2y - 4z + 3r - s = 0
    x + 2y - 2z + 2r + s = 0
    2x + 4y - 2z + 3r + 4s = 0
Reduce the system to echelon form:
    x + 2y - 4z + 3r - s = 0
             2z -  r + 2s = 0
             6z - 3r + 6s = 0
and then
    x + 2y - 4z + 3r - s = 0
             2z -  r + 2s = 0
There are 5 unknowns and 2 (nonzero) equations in echelon form; hence dim W = 5 - 2 = 3. Note that the free variables are y, r and s. Set:
    (i) y = 1, r = 0, s = 0,   (ii) y = 0, r = 1, s = 0,   (iii) y = 0, r = 0, s = 1
to obtain the following respective solutions:
    v1 = (-2, 1, 0, 0, 0),   v2 = (-1, 0, 1/2, 1, 0),   v3 = (-3, 0, -1, 0, 1)
The set {v1, v2, v3} is a basis of the solution space W.
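The recipe above, one basis vector per free variable, can be checked mechanically. A Python sketch (the helper name `solves` is ours) verifying that the three claimed solutions of Example 5.10 satisfy every equation of the system:

```python
from fractions import Fraction

# Coefficient matrix of the homogeneous system in Example 5.10,
# in the unknowns (x, y, z, r, s).
A = [[1, 2, -4, 3, -1],
     [1, 2, -2, 2, 1],
     [2, 4, -2, 3, 4]]

# The claimed basis of the solution space W.
v1 = [-2, 1, 0, 0, 0]
v2 = [-1, 0, Fraction(1, 2), 1, 0]
v3 = [-3, 0, -1, 0, 1]

def solves(A, v):
    """True when v satisfies every equation of the homogeneous system Av = 0."""
    return all(sum(a * x for a, x in zip(row, v)) == 0 for row in A)

print(all(solves(A, v) for v in (v1, v2, v3)))   # True
```

Independence of v1, v2, v3 is visible directly: each has a 1 in a free-variable slot where the other two have 0.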
COORDINATES
Let {e1, ..., en} be a basis of an n-dimensional vector space V over a field K, and let v be any vector in V. Since {ei} generates V, v is a linear combination of the ei:
    v = a1 e1 + a2 e2 + ... + an en,   ai ∈ K
Since the ei are independent, such a representation is unique (Problem 5.7), i.e. the n scalars a1, ..., an are completely determined by the vector v and the basis {ei}. We call these scalars the coordinates of v in {ei}, and we call the n-tuple (a1, ..., an) the coordinate vector of v relative to {ei} and denote it by [v]e or simply [v]:
    [v]e = (a1, a2, ..., an)
Example 5.11: Let V be the vector space of polynomials with degree ≤ 2:
    V = {at^2 + bt + c : a, b, c ∈ R}
The polynomials
    e1 = 1,   e2 = t - 1   and   e3 = (t - 1)^2 = t^2 - 2t + 1
form a basis for V. Let v = 2t^2 - 5t + 6. Find [v]e, the coordinate vector of v relative to the basis {e1, e2, e3}.
Set v as a linear combination of the ei using the unknowns x, y and z: v = x e1 + y e2 + z e3:
    2t^2 - 5t + 6 = x(1) + y(t - 1) + z(t^2 - 2t + 1)
                  = x + yt - y + zt^2 - 2zt + z
                  = zt^2 + (y - 2z)t + (x - y + z)
Then set the coefficients of the same powers of t equal to each other:
    x - y + z = 6
        y - 2z = -5
             z = 2
The solution of the above system is x = 3, y = -1, z = 2. Thus
    v = 3e1 - e2 + 2e3,   and so   [v]e = (3, -1, 2)
Example 5.12: Consider the real space R^3. Find the coordinate vector of v = (3, 1, -4) relative to the basis f1 = (1, 1, 1), f2 = (0, 1, 1), f3 = (0, 0, 1).
Set v as a linear combination of the fi using the unknowns x, y and z: v = x f1 + y f2 + z f3:
    (3, 1, -4) = x(1, 1, 1) + y(0, 1, 1) + z(0, 0, 1)
               = (x, x, x) + (0, y, y) + (0, 0, z)
               = (x, x + y, x + y + z)
Then set the corresponding components equal to each other to obtain the equivalent system of equations
    x             = 3
    x + y         = 1
    x + y + z     = -4
having solution x = 3, y = -2, z = -5. Thus [v]f = (3, -2, -5).
We remark that relative to the usual basis e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1), the coordinate vector of v is identical to v itself: [v]e = (3, 1, -4) = v.
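Because the basis of Example 5.12 is triangular, the coordinates can be found by forward substitution rather than full elimination. A Python sketch (the function name is ours, and it is tied to this particular basis):

```python
def coords_in_f(v):
    """Coordinates of v relative to f1 = (1,1,1), f2 = (0,1,1), f3 = (0,0,1).
    The triangular shape of the basis lets us solve by forward substitution."""
    x = v[0]            # first component involves only f1
    y = v[1] - x        # second component involves f1 and f2
    z = v[2] - x - y    # third component involves all three
    return (x, y, z)

print(coords_in_f((3, 1, -4)))   # (3, -2, -5), as computed in Example 5.12
```

The same idea works for any basis whose matrix of components is triangular with nonzero diagonal.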
We have shown above that to each vector v ∈ V there corresponds, relative to a given basis {e1, ..., en}, an n-tuple [v]e in K^n. On the other hand, if (a1, ..., an) ∈ K^n, then there exists a vector in V of the form a1 e1 + ... + an en. Thus the basis {ei} determines a one-to-one correspondence between the vectors in V and the n-tuples in K^n. Observe also that if
    v = a1 e1 + ... + an en   corresponds to   (a1, ..., an)
and w = b1 e1 + ... + bn en   corresponds to   (b1, ..., bn)
then
    v + w = (a1 + b1)e1 + ... + (an + bn)en   corresponds to   (a1, ..., an) + (b1, ..., bn)
and, for any scalar k ∈ K,
    kv = (k a1)e1 + ... + (k an)en   corresponds to   k(a1, ..., an)
That is,
    [v + w]e = [v]e + [w]e   and   [kv]e = k[v]e
Thus the above one-to-one correspondence between V and K^n preserves the vector space operations of vector addition and scalar multiplication; we then say that V and K^n are isomorphic, written V ≅ K^n. We state this result formally.
Theorem 5.12: Let V be an n-dimensional vector space over a field K. Then V and K^n are isomorphic.
The next example gives a practical application of the above result.
Example 5.13: Determine whether the following matrices are dependent or independent:

    A = ( 1  2  3 )    B = ( 1  3  4 )    C = (  3   8  11 )
        ( 4  0  1 )        ( 6  5  4 )        ( 16  10   9 )

The coordinate vectors of the above matrices relative to the basis in Example 5.4, page 89, are
    [A] = (1, 2, 3, 4, 0, 1),   [B] = (1, 3, 4, 6, 5, 4),   [C] = (3, 8, 11, 16, 10, 9)
Form the matrix M whose rows are the above coordinate vectors:

    M = ( 1  2   3   4   0  1 )
        ( 1  3   4   6   5  4 )
        ( 3  8  11  16  10  9 )

Row reduce M to echelon form:

    M  to  ( 1  2  3  4   0  1 )    to  ( 1  2  3  4  0  1 )
           ( 0  1  1  2   5  3 )        ( 0  1  1  2  5  3 )
           ( 0  2  2  4  10  6 )        ( 0  0  0  0  0  0 )

Since the echelon matrix has only two nonzero rows, the coordinate vectors [A], [B] and [C] generate a space of dimension 2 and so are dependent. Accordingly, the original matrices A, B and C are dependent.
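The device of Example 5.13, replacing each matrix by its coordinate vector and row reducing, mechanizes directly. A Python sketch (helper names ours):

```python
from fractions import Fraction

def rank(rows):
    """Number of nonzero rows after Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def flatten(M):
    """Coordinate vector of a matrix relative to the usual basis of unit matrices."""
    return [x for row in M for x in row]

A = [[1, 2, 3], [4, 0, 1]]
B = [[1, 3, 4], [6, 5, 4]]
C = [[3, 8, 11], [16, 10, 9]]

# The matrices are dependent iff their coordinate vectors span a space of dimension < 3.
print(rank([flatten(A), flatten(B), flatten(C)]))   # 2, so A, B and C are dependent
```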
Solved Problems
LINEAR DEPENDENCE
5.1. Determine whether or not u and v are linearly dependent if:
(i) u = (3, 4), v = (1, -3)      (iii) u = (4, 3, -2), v = (2, -6, 7)
(ii) u = (2, -3), v = (6, -9)    (iv) u = (-4, 6, -2), v = (2, -3, 1)

(v) u = ( 1 -2  4 ),  v = ( 2 -4  8 )      (vi) u = ( 1  2 -3 ),  v = ( 6 -5  4 )
        ( 3  0 -1 )       ( 6  0 -2 )               ( 6 -5  4 )       ( 1  2 -3 )

(vii) u = 2 - 5t + 6t^2 - t^3,   v = 3 + 2t - 4t^2 + 5t^3
(viii) u = 1 - 3t + 2t^2 - 3t^3,   v = -3 + 9t - 6t^2 + 9t^3
Two vectors u and v are dependent if and only if one is a multiple of the other.
(i) No. (ii) Yes; for v = 3u. (iii) No. (iv) Yes; for u = -2v. (v) Yes; for v = 2u. (vi) No. (vii) No. (viii) Yes; for v = -3u.
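The multiple test used above can be written out explicitly. A Python sketch (the function name is ours; integer entries are assumed so that exact Fraction ratios can be formed):

```python
from fractions import Fraction

def dependent_pair(u, v):
    """Two vectors are dependent iff one is a scalar multiple of the other.
    Integer entries assumed, so the candidate ratio is an exact Fraction."""
    if all(x == 0 for x in u) or all(x == 0 for x in v):
        return True            # the zero vector is dependent on any vector
    i = next(j for j, x in enumerate(u) if x != 0)
    k = Fraction(v[i], u[i])   # candidate ratio, fixed by the first nonzero slot
    return all(b == k * a for a, b in zip(u, v))

print(dependent_pair((2, -3), (6, -9)))        # True: v = 3u, as in (ii)
print(dependent_pair((4, 3, -2), (2, -6, 7)))  # False, as in (iii)
```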
5.2. Determine whether or not the following vectors in R^3 are linearly dependent:
(i) (1, -2, 1), (2, 1, -1), (7, -4, 1)      (iii) (1, 2, -3), (1, -3, 2), (2, -1, 5)
(ii) (1, -3, 7), (2, 0, -6), (3, -1, -1), (2, 4, -5)      (iv) (2, -3, 7), (0, 0, 0), (3, -1, -4)
(i) Method 1. Set a linear combination of the vectors equal to the zero vector using unknown scalars x, y and z:
    x(1, -2, 1) + y(2, 1, -1) + z(7, -4, 1) = (0, 0, 0)
Then
    (x, -2x, x) + (2y, y, -y) + (7z, -4z, z) = (0, 0, 0)
or  (x + 2y + 7z, -2x + y - 4z, x - y + z) = (0, 0, 0)
Set corresponding components equal to each other to obtain the equivalent homogeneous system, and reduce to echelon form:
    x + 2y + 7z = 0               x + 2y + 7z = 0               x + 2y + 7z = 0
    -2x + y - 4z = 0      or          5y + 10z = 0      or          y + 2z = 0
    x - y + z = 0                    -3y - 6z = 0
The system, in echelon form, has only two nonzero equations in the three unknowns; hence the system has a nonzero solution. Thus the original vectors are linearly dependent.
Method 2. Form the matrix whose rows are the given vectors, and reduce to echelon form using the elementary row operations:
    ( 1 -2  1 )    to  ( 1  -2  1 )    to  ( 1 -2  1 )
    ( 2  1 -1 )        ( 0   5 -3 )        ( 0  5 -3 )
    ( 7 -4  1 )        ( 0  10 -6 )        ( 0  0  0 )
Since the echelon matrix has a zero row, the vectors are dependent. (The three given vectors generate a space of dimension 2.)
(ii) Yes, since any four (or more) vectors in R^3 are dependent.
(iii) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form:
    ( 1  2 -3 )    to  ( 1  2 -3 )    to  ( 1  2 -3 )
    ( 1 -3  2 )        ( 0 -5  5 )        ( 0 -5  5 )
    ( 2 -1  5 )        ( 0 -5 11 )        ( 0  0  6 )
Since the echelon matrix has no zero rows, the vectors are independent. (The three given vectors generate a space of dimension 3.)
(iv) Since 0 = (0, 0, 0) is one of the vectors, the vectors are dependent.
5.3. Let V be the vector space of 2 × 2 matrices over R. Determine whether the matrices A, B, C ∈ V are dependent where:

(i) A = ( 1  1 ),  B = ( 1  0 ),  C = ( 1  1 )      (ii) A = ( 1  2 ),  B = ( 3 -1 ),  C = (  1 -5 )
        ( 1  1 )       ( 0  1 )       ( 0  0 )               ( 3  1 )       ( 2  2 )       ( -4  0 )

(i) Set a linear combination of the matrices A, B and C equal to the zero matrix using unknown scalars x, y and z; that is, set xA + yB + zC = 0. Thus:

    x( 1  1 ) + y( 1  0 ) + z( 1  1 )  =  ( 0  0 )
     ( 1  1 )    ( 0  1 )    ( 0  0 )     ( 0  0 )

or  ( x + y + z   x + z )  =  ( 0  0 )
    ( x           x + y )     ( 0  0 )
Set corresponding entries equal to each other to obtain the equivalent homogeneous system of equations:
    x + y + z = 0
    x + z = 0
    x = 0
    x + y = 0
Solving the above system we obtain only the zero solution, x = 0, y = 0, z = 0. We have shown that xA + yB + zC = 0 implies x = 0, y = 0, z = 0; hence the matrices A, B and C are linearly independent.
(ii) Set a linear combination of the matrices A, B and C equal to the zero matrix using unknown scalars x, y and z; that is, set xA + yB + zC = 0. Thus:

    x( 1  2 ) + y( 3 -1 ) + z(  1 -5 )  =  ( 0  0 )
     ( 3  1 )    ( 2  2 )    ( -4  0 )     ( 0  0 )

or  (  x  2x ) + ( 3y  -y ) + (  z  -5z )  =  ( 0  0 )
    ( 3x   x )   ( 2y  2y )   ( -4z   0 )     ( 0  0 )

or  ( x + 3y + z     2x - y - 5z )  =  ( 0  0 )
    ( 3x + 2y - 4z   x + 2y      )     ( 0  0 )

Set corresponding entries equal to each other to obtain the equivalent homogeneous system of linear equations and reduce to echelon form:
    x + 3y + z = 0                x + 3y + z = 0
    2x - y - 5z = 0       or      -7y - 7z = 0        or finally      x + 3y + z = 0
    3x + 2y - 4z = 0              -7y - 7z = 0                            y + z = 0
    x + 2y = 0                    -y - z = 0
The system in echelon form has a free variable and hence a nonzero solution, for example, x = 2, y = -1, z = 1. We have shown that xA + yB + zC = 0 does not imply that x = 0, y = 0, z = 0; hence the matrices are linearly dependent.
5.4. Let V be the vector space of polynomials of degree ≤ 3 over R. Determine whether u, v, w ∈ V are independent or dependent where:
(i) u = t^3 - 3t^2 + 5t + 1,  v = t^3 - t^2 + 8t + 2,  w = 2t^3 - 4t^2 + 9t + 5
(ii) u = t^3 + 4t^2 - 2t + 3,  v = t^3 + 6t^2 - t + 4,  w = 3t^3 + 8t^2 - 8t + 7
(i) Set a linear combination of the polynomials u, v and w equal to the zero polynomial using unknown scalars x, y and z; that is, set xu + yv + zw = 0. Thus:
    x(t^3 - 3t^2 + 5t + 1) + y(t^3 - t^2 + 8t + 2) + z(2t^3 - 4t^2 + 9t + 5) = 0
or  xt^3 - 3xt^2 + 5xt + x + yt^3 - yt^2 + 8yt + 2y + 2zt^3 - 4zt^2 + 9zt + 5z = 0
or  (x + y + 2z)t^3 + (-3x - y - 4z)t^2 + (5x + 8y + 9z)t + (x + 2y + 5z) = 0
The coefficients of the powers of t must each be 0:
    x + y + 2z = 0
    -3x - y - 4z = 0
    5x + 8y + 9z = 0
    x + 2y + 5z = 0
Solving the above homogeneous system, we obtain only the zero solution: x = 0, y = 0, z = 0; hence u, v and w are independent.
(ii) Set a linear combination of the polynomials u, v and w equal to the zero polynomial using unknown scalars x, y and z; that is, set xu + yv + zw = 0. Thus:
    x(t^3 + 4t^2 - 2t + 3) + y(t^3 + 6t^2 - t + 4) + z(3t^3 + 8t^2 - 8t + 7) = 0
or  xt^3 + 4xt^2 - 2xt + 3x + yt^3 + 6yt^2 - yt + 4y + 3zt^3 + 8zt^2 - 8zt + 7z = 0
or  (x + y + 3z)t^3 + (4x + 6y + 8z)t^2 + (-2x - y - 8z)t + (3x + 4y + 7z) = 0
Set the coefficients of the powers of t each equal to 0 and reduce the system to echelon form:
    x + y + 3z = 0                x + y + 3z = 0
    4x + 6y + 8z = 0      or      2y - 4z = 0        or finally      x + y + 3z = 0
    -2x - y - 8z = 0              y - 2z = 0                             y - 2z = 0
    3x + 4y + 7z = 0              y - 2z = 0
The system in echelon form has a free variable and hence a nonzero solution. We have shown that xu + yv + zw = 0 does not imply that x = 0, y = 0, z = 0; hence the polynomials are linearly dependent.
5.5. Let V be the vector space of functions from R into R. Show that f, g, h ∈ V are independent where: (i) f(t) = e^t, g(t) = t^2, h(t) = t; (ii) f(t) = sin t, g(t) = cos t, h(t) = t.
In each case set a linear combination of the functions equal to the zero function using unknown scalars x, y and z: xf + yg + zh = 0; and then show that x = 0, y = 0, z = 0. We emphasize that xf + yg + zh = 0 means that, for every value of t, xf(t) + yg(t) + zh(t) = 0.
(i) In the equation xe^t + yt^2 + zt = 0, substitute
    t = 0  to obtain  xe^0 + y·0 + z·0 = 0   or   x = 0
    t = 1  to obtain  xe + y + z = 0
    t = 2  to obtain  xe^2 + 4y + 2z = 0
Solve the system
    x = 0
    xe + y + z = 0
    xe^2 + 4y + 2z = 0
to obtain only the zero solution: x = 0, y = 0, z = 0. Hence f, g and h are independent.
(ii) Method 1. In the equation x sin t + y cos t + zt = 0, substitute
    t = 0    to obtain  x·0 + y·1 + z·0 = 0    or   y = 0
    t = π/2  to obtain  x·1 + y·0 + zπ/2 = 0   or   x + πz/2 = 0
    t = π    to obtain  x·0 + y·(-1) + z·π = 0  or   -y + πz = 0
Solve the system
    y = 0
    x + πz/2 = 0
    -y + πz = 0
to obtain only the zero solution: x = 0, y = 0, z = 0. Hence f, g and h are independent.
Method 2. Take the first, second and third derivatives of x sin t + y cos t + zt = 0 with respect to t to get
    x cos t - y sin t + z = 0     (1)
    -x sin t - y cos t = 0        (2)
    -x cos t + y sin t = 0        (3)
Add (1) and (3) to obtain z = 0. Multiply (2) by sin t and (3) by cos t, and then add:
    sin t × (2):  -x sin^2 t - y sin t cos t = 0
    cos t × (3):  -x cos^2 t + y sin t cos t = 0
                  -x(sin^2 t + cos^2 t) = 0   or   x = 0
Lastly, multiply (2) by -cos t and (3) by sin t; and then add to obtain
    y(cos^2 t + sin^2 t) = 0   or   y = 0
Since x sin t + y cos t + zt = 0 implies x = 0, y = 0, z = 0, f, g and h are independent.
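The substitution method of Problem 5.5 amounts to showing that the 3 × 3 matrix of sampled function values is nonsingular: a nonzero determinant forces x = y = z = 0. (Sampling can only prove independence, never dependence, since dependent sample vectors might still come from independent functions.) A Python sketch with the sample points t = 0, 1, 2 of part (i) (helper names ours):

```python
import math

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Sample f(t) = e^t, g(t) = t^2, h(t) = t at t = 0, 1, 2, as in Problem 5.5(i).
funcs = (math.exp, lambda t: t * t, lambda t: t)
M = [[fn(t) for fn in funcs] for t in (0.0, 1.0, 2.0)]

# Nonzero determinant: the sampled system has only the zero solution, so
# x f + y g + z h = 0 forces x = y = z = 0 and the functions are independent.
print(det3(M) != 0)   # True
```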
5.6. Let u, v and w be independent vectors. Show that u + v, u - v and u - 2v + w are also independent.
Suppose x(u + v) + y(u - v) + z(u - 2v + w) = 0 where x, y and z are scalars. Then
    xu + xv + yu - yv + zu - 2zv + zw = 0
or  (x + y + z)u + (x - y - 2z)v + zw = 0
But u, v and w are linearly independent; hence the coefficients in the above relation are each 0:
    x + y + z = 0
    x - y - 2z = 0
    z = 0
The only solution to the above system is x = 0, y = 0, z = 0. Hence u + v, u - v and u - 2v + w are independent.
5.7. Let v1, v2, ..., vm be independent vectors, and suppose u is a linear combination of the vi, say u = a1v1 + a2v2 + ... + amvm where the ai are scalars. Show that the above representation of u is unique.
Suppose u = b1v1 + b2v2 + ... + bmvm where the bi are scalars. Subtracting,
    0 = u - u = (a1 - b1)v1 + (a2 - b2)v2 + ... + (am - bm)vm
But the vi are linearly independent; hence the coefficients in the above relation are each 0:
    a1 - b1 = 0,   a2 - b2 = 0,   ...,   am - bm = 0
Hence a1 = b1, a2 = b2, ..., am = bm, and so the above representation of u as a linear combination of the vi is unique.
5.8. Show that the vectors v = (1 + i, 2i) and w = (1, 1 + i) in C^2 are linearly dependent over the complex field C but are linearly independent over the real field R.
Recall that 2 vectors are dependent iff one is a multiple of the other. Since the first coordinate of w is 1, v can be a multiple of w iff v = (1 + i)w. But 1 + i ∉ R; hence v and w are independent over R. Since
    (1 + i)w = (1 + i)(1, 1 + i) = (1 + i, 2i) = v
and 1 + i ∈ C, they are dependent over C.
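Python's built-in complex numbers make the arithmetic of Problem 5.8 a one-line check. (This verifies the dependence over C; the independence over R is the field-theoretic observation that the required scalar 1 + i is not real.)

```python
# v = (1+i, 2i) and w = (1, 1+i) in C^2; the complex scalar 1+i carries w onto v.
v = (1 + 1j, 2j)
w = (1, 1 + 1j)
scalar = 1 + 1j

# (1+i)(1+i) = 1 + 2i + i^2 = 2i, so the second coordinates match as well.
print(tuple(scalar * x for x in w) == v)   # True: v = (1+i)w, dependent over C
```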
5.9. Suppose S = {v1, ..., vm} contains a dependent subset, say {v1, ..., vr}. Show that S is also dependent. Hence every subset of an independent set is independent.
Since {v1, ..., vr} is dependent, there exist scalars a1, ..., ar, not all 0, such that
    a1v1 + a2v2 + ... + arvr = 0
Hence there exist scalars a1, ..., ar, 0, ..., 0, not all 0, such that
    a1v1 + ... + arvr + 0v_{r+1} + ... + 0vm = 0
Accordingly, S is dependent.
5.10. Suppose {v1, ..., vm} is independent, but {v1, ..., vm, w} is dependent. Show that w is a linear combination of the vi.
Method 1. Since {v1, ..., vm, w} is dependent, there exist scalars a1, ..., am, b, not all 0, such that a1v1 + ... + amvm + bw = 0. If b = 0, then one of the ai is not zero and a1v1 + ... + amvm = 0. But this contradicts the hypothesis that {v1, ..., vm} is independent. Accordingly, b ≠ 0 and so
    w = -b^{-1}(a1v1 + ... + amvm) = -b^{-1}a1v1 - ... - b^{-1}amvm
That is, w is a linear combination of the vi.
Method 2. If w = 0, then w = 0v1 + ... + 0vm. On the other hand, if w ≠ 0 then, by Lemma 5.2, one of the vectors in {v1, ..., vm, w} is a linear combination of the preceding vectors. This vector cannot be one of the v's since {v1, ..., vm} is independent. Hence w is a linear combination of the vi.
PROOFS OF THEOREMS
5.11. Prove Lemma 5.2: The nonzero vectors v1, ..., vm are linearly dependent if and only if one of them, say vi, is a linear combination of the preceding vectors: vi = a1v1 + ... + a_{i-1}v_{i-1}.
Suppose vi = a1v1 + ... + a_{i-1}v_{i-1}. Then
    a1v1 + ... + a_{i-1}v_{i-1} - vi + 0v_{i+1} + ... + 0vm = 0
and the coefficient of vi is not 0. Hence the vi are linearly dependent.
Conversely, suppose the vi are linearly dependent. Then there exist scalars a1, ..., am, not all 0, such that a1v1 + ... + amvm = 0. Let k be the largest integer such that ak ≠ 0. Then
    a1v1 + ... + akvk + 0v_{k+1} + ... + 0vm = 0   or   a1v1 + ... + akvk = 0
Suppose k = 1; then a1v1 = 0, a1 ≠ 0 and so v1 = 0. But the vi are nonzero vectors; hence k > 1 and
    vk = -ak^{-1}a1v1 - ... - ak^{-1}a_{k-1}v_{k-1}
That is, vk is a linear combination of the preceding vectors.
5.12. Prove Theorem 5.1: The nonzero rows R1, ..., Rn of a matrix in echelon form are linearly independent.
Suppose {Rn, R_{n-1}, ..., R1} is dependent. Then one of the rows, say Rm, is a linear combination of the preceding rows:
    Rm = a_{m+1}R_{m+1} + a_{m+2}R_{m+2} + ... + a_n R_n     (*)
Now suppose the kth component of Rm is its first nonzero entry. Then, since the matrix is in echelon form, the kth components of R_{m+1}, ..., Rn are all 0, and so the kth component of (*) is a_{m+1}·0 + a_{m+2}·0 + ... + a_n·0 = 0. But this contradicts the assumption that the kth component of Rm is not 0. Thus R1, ..., Rn are independent.
5.13. Suppose {v1, ..., vm} generates a vector space V. Prove:
(i) If w ∈ V, then {w, v1, ..., vm} is linearly dependent and generates V.
(ii) If vi is a linear combination of the preceding vectors, then {v1, ..., v_{i-1}, v_{i+1}, ..., vm} generates V.
(i) If w ∈ V, then w is a linear combination of the vi since {vi} generates V. Accordingly, {w, v1, ..., vm} is linearly dependent. Clearly, w with the vi generate V since the vi by themselves generate V. That is, {w, v1, ..., vm} generates V.
(ii) Suppose vi = k1v1 + ... + k_{i-1}v_{i-1}. Let u ∈ V. Since {vi} generates V, u is a linear combination of the vi's, say u = a1v1 + ... + amvm. Substituting for vi, we obtain
    u = a1v1 + ... + a_{i-1}v_{i-1} + ai(k1v1 + ... + k_{i-1}v_{i-1}) + a_{i+1}v_{i+1} + ... + amvm
      = (a1 + ai·k1)v1 + ... + (a_{i-1} + ai·k_{i-1})v_{i-1} + a_{i+1}v_{i+1} + ... + amvm
Thus {v1, ..., v_{i-1}, v_{i+1}, ..., vm} generates V. In other words, we can delete vi from the generating set and still retain a generating set.
5.14. Prove Lemma 5.4: Suppose {v1, ..., vn} generates a vector space V. If {w1, ..., wm} is linearly independent, then m ≤ n and V is generated by a set of the form {w1, ..., wm, v_{i1}, ..., v_{i_{n-m}}}. Thus, in particular, any n + 1 or more vectors in V are linearly dependent.
It suffices to prove the theorem in the case that the vi are all not 0. (Prove!) Since {vi} generates V, we have by the preceding problem that
    {w1, v1, ..., vn}     (1)
is linearly dependent and also generates V. By Lemma 5.2, one of the vectors in (1) is a linear combination of the preceding vectors. This vector cannot be w1, so it must be one of the v's, say vj. Thus by the preceding problem we can delete vj from the generating set (1) and obtain the generating set
    {w1, v1, ..., v_{j-1}, v_{j+1}, ..., vn}     (2)
Now we repeat the argument with the vector w2. That is, since (2) generates V, the set
    {w1, w2, v1, ..., v_{j-1}, v_{j+1}, ..., vn}     (3)
is linearly dependent and also generates V. Again by Lemma 5.2, one of the vectors in (3) is a linear combination of the preceding vectors. We emphasize that this vector cannot be w1 or w2 since {w1, ..., wm} is independent; hence it must be one of the v's, say vk. Thus by the preceding problem we can delete vk from the generating set (3) and obtain the generating set
    {w1, w2, v1, ..., v_{j-1}, v_{j+1}, ..., v_{k-1}, v_{k+1}, ..., vn}
We repeat the argument with w3 and so forth. At each step we are able to add one of the w's and delete one of the v's in the generating set. If m ≤ n, then we finally obtain a generating set of the required form:
    {w1, ..., wm, v_{i1}, ..., v_{i_{n-m}}}
Lastly, we show that m > n is not possible. Otherwise, after n of the above steps, we obtain the generating set {w1, ..., wn}. This implies that w_{n+1} is a linear combination of w1, ..., wn, which contradicts the hypothesis that {wi} is linearly independent.
5.15. Prove Theorem 5.3: Let V be a finite dimensional vector space. Then every basis of V has the same number of vectors.
Suppose {e1, e2, ..., en} is a basis of V, and suppose {f1, f2, ...} is another basis of V. Since {ei} generates V, the basis {f1, f2, ...} must contain n or fewer vectors, or else it is dependent by the preceding problem. On the other hand, if the basis {f1, f2, ...} contains fewer than n vectors, then {e1, ..., en} is dependent by the preceding problem. Thus the basis {f1, f2, ...} contains exactly n vectors, and so the theorem is true.
5.16. Prove Theorem 5.5: Suppose {v1, ..., vm} is a maximal independent subset of a set S which generates a vector space V. Then {v1, ..., vm} is a basis of V.
Suppose w ∈ S. Then, since {vi} is a maximal independent subset of S, {v1, ..., vm, w} is linearly dependent. By Problem 5.10, w is a linear combination of the vi, that is, w ∈ L(vi). Hence S ⊆ L(vi). This leads to V = L(S) ⊆ L(vi) ⊆ V. Accordingly, {vi} generates V and, since it is independent, it is a basis of V.
5.17. Suppose V is generated by a finite set S. Show that V is of finite dimension and, in particular, a subset of S is a basis of V.
Method 1. Of all the independent subsets of S, and there is a finite number of them since S is finite,
one of them is maximal. By the preceding problem this subset of S is a basis of V.
Method 2. If S is independent, it is a basis of V. If S is dependent, one of the vectors is a linear
combination of the preceding vectors. We may delete this vector and still retain a generating set.
We continue this process until we obtain a subset which is independent and generates V, i.e. is a
basis of V.
5.18. Prove Theorem 5.6: Let V be of finite dimension n. Then:
(i) Any set of n + 1 or more vectors is linearly dependent.
(ii) Any linearly independent set is part of a basis.
(iii) A linearly independent set with n elements is a basis.
Suppose {e1, ..., en} is a basis of V.
(i) Since {e1, ..., en} generates V, any n + 1 or more vectors are dependent by Lemma 5.4.
(ii) Suppose {v1, ..., vr} is independent. By Lemma 5.4, V is generated by a set of the form
    S = {v1, ..., vr, e_{i1}, ..., e_{i_{n-r}}}
By the preceding problem, a subset of S is a basis. But S contains n elements and every basis of V contains n elements. Thus S is a basis of V and contains {v1, ..., vr} as a subset.
(iii) By (ii), an independent set T with n elements is part of a basis. But every basis of V contains n elements. Thus T is a basis.
5.19. Prove Theorem 5.7: Let W be a subspace of an n-dimensional vector space V. Then dim W ≤ n. In particular, if dim W = n, then W = V.
Since V is of dimension n, any n + 1 or more vectors are linearly dependent. Furthermore, since a basis of W consists of linearly independent vectors, it cannot contain more than n elements. Accordingly, dim W ≤ n.
In particular, if {w1, ..., wn} is a basis of W, then since it is an independent set with n elements it is also a basis of V. Thus W = V when dim W = n.
5.20. Prove Theorem 5.8: dim (U + W) = dim U + dim W - dim (U ∩ W).
Observe that U ∩ W is a subspace of both U and W. Suppose dim U = m, dim W = n and dim (U ∩ W) = r. Suppose {v1, ..., vr} is a basis of U ∩ W. By Theorem 5.6(ii), we can extend {vi} to a basis of U and to a basis of W; say,
    {v1, ..., vr, u1, ..., u_{m-r}}   and   {v1, ..., vr, w1, ..., w_{n-r}}
are bases of U and W respectively. Let
    B = {v1, ..., vr, u1, ..., u_{m-r}, w1, ..., w_{n-r}}
Note that B has exactly m + n - r elements. Thus the theorem is proved if we can show that B is a basis of U + W. Since {vi, uj} generates U and {vi, wk} generates W, the union B = {vi, uj, wk} generates U + W. Thus it suffices to show that B is independent.
Suppose
    a1v1 + ... + arvr + b1u1 + ... + b_{m-r}u_{m-r} + c1w1 + ... + c_{n-r}w_{n-r} = 0     (1)
where ai, bj, ck are scalars. Let
    v = a1v1 + ... + arvr + b1u1 + ... + b_{m-r}u_{m-r}     (2)
By (1), we also have that
    v = -c1w1 - ... - c_{n-r}w_{n-r}     (3)
Since {vi, uj} ⊆ U, v ∈ U by (2); and since {wk} ⊆ W, v ∈ W by (3). Accordingly, v ∈ U ∩ W. Now {vi} is a basis of U ∩ W and so there exist scalars d1, ..., dr for which v = d1v1 + ... + drvr. Thus by (3) we have
    d1v1 + ... + drvr + c1w1 + ... + c_{n-r}w_{n-r} = 0
But {vi, wk} is a basis of W and so is independent. Hence the above equation forces c1 = 0, ..., c_{n-r} = 0. Substituting this into (1), we obtain
    a1v1 + ... + arvr + b1u1 + ... + b_{m-r}u_{m-r} = 0
But {vi, uj} is a basis of U and so is independent. Hence the above equation forces a1 = 0, ..., ar = 0, b1 = 0, ..., b_{m-r} = 0.
Since the equation (1) implies that the ai, bj and ck are all 0, B = {vi, uj, wk} is independent and the theorem is proved.
5.21. Prove Theorem 5.9: The row rank and the column rank of any matrix are equal.
Let A be an arbitrary m × n matrix:

    A = ( a11  a12  ...  a1n )
        ( a21  a22  ...  a2n )
        ( ..................  )
        ( am1  am2  ...  amn )

Let R1, R2, ..., Rm denote its rows:
    R1 = (a11, a12, ..., a1n),   ...,   Rm = (am1, am2, ..., amn)
Suppose the row rank is r and that the following r vectors form a basis for the row space:
    S1 = (b11, b12, ..., b1n),   S2 = (b21, b22, ..., b2n),   ...,   Sr = (br1, br2, ..., brn)
Then each of the row vectors is a linear combination of the Si:
    R1 = k11 S1 + k12 S2 + ... + k1r Sr
    R2 = k21 S1 + k22 S2 + ... + k2r Sr
    ................................
    Rm = km1 S1 + km2 S2 + ... + kmr Sr
where the kij are scalars. Setting the ith components of each of the above vector equations equal to each other, we obtain the following system of equations, each valid for i = 1, ..., n:
    a1i = k11 b1i + k12 b2i + ... + k1r bri
    a2i = k21 b1i + k22 b2i + ... + k2r bri
    ..................................
    ami = km1 b1i + km2 b2i + ... + kmr bri
Thus for i = 1, ..., n:
    (a1i, a2i, ..., ami) = b1i(k11, k21, ..., km1) + b2i(k12, k22, ..., km2) + ... + bri(k1r, k2r, ..., kmr)
In other words, each of the columns of A is a linear combination of the r vectors
    (k11, k21, ..., km1),   (k12, k22, ..., km2),   ...,   (k1r, k2r, ..., kmr)
Thus the column space of the matrix A has dimension at most r, i.e. column rank ≤ r. Hence column rank ≤ row rank.
Similarly (or by considering the transpose matrix A^t) we obtain row rank ≤ column rank. Thus the row rank and column rank are equal.
BASIS AND DIMENSION
5.22. Determine whether or not the following form a basis for the vector space R^3:
(i) (1, 1, 1) and (1, 1, 5)
(ii) (1, 2, 3), (1, 0, 1), (3, 1, 0) and (2, 1, 2)
(iii) (1, 1, 1), (1, 2, 3) and (2, 1, 1)
(iv) (1, 1, 2), (1, 2, 5) and (5, 3, 4)
(i) and (ii). No; for a basis of R^3 must contain exactly 3 elements, since R^3 is of dimension 3.
(iii) The vectors form a basis if and only if they are independent. Thus form the matrix whose rows are the given vectors, and row reduce to echelon form:
    ( 1  1  1 )    to  ( 1  1  1 )    to  ( 1  1  1 )
    ( 1  2  3 )        ( 0  1  2 )        ( 0  1  2 )
    ( 2  1  1 )        ( 0 -1 -1 )        ( 0  0  1 )
The echelon matrix has no zero rows; hence the three vectors are independent and so form a basis for R^3.
(iv) Form the matrix whose rows are the given vectors, and row reduce to echelon form:
    ( 1  1  2 )    to  ( 1  1  2 )    to  ( 1  1  2 )
    ( 1  2  5 )        ( 0  1  3 )        ( 0  1  3 )
    ( 5  3  4 )        ( 0 -2 -6 )        ( 0  0  0 )
The echelon matrix has a zero row, i.e. only two nonzero rows; hence the three vectors are dependent and so do not form a basis for R^3.
5.23. Let W be the subspace of R^4 generated by the vectors (1, -2, 5, -3), (2, 3, 1, -4) and (3, 8, -3, -5). (i) Find a basis and the dimension of W. (ii) Extend the basis of W to a basis of the whole space R^4.
(i) Form the matrix whose rows are the given vectors, and row reduce to echelon form:
    ( 1 -2   5  -3 )    to  ( 1  -2    5  -3 )    to  ( 1 -2  5 -3 )
    ( 2  3   1  -4 )        ( 0   7   -9   2 )        ( 0  7 -9  2 )
    ( 3  8  -3  -5 )        ( 0  14  -18   4 )        ( 0  0  0  0 )
The nonzero rows (1, -2, 5, -3) and (0, 7, -9, 2) of the echelon matrix form a basis of the row space, that is, of W. Thus, in particular, dim W = 2.
(ii) We seek four independent vectors which include the above two vectors. The vectors (1, -2, 5, -3), (0, 7, -9, 2), (0, 0, 1, 0) and (0, 0, 0, 1) are independent (since they form an echelon matrix), and so they form a basis of R^4 which is an extension of the basis of W.
5.24. Let W be the space generated by the polynomials
    v1 = t^3 - 2t^2 + 4t + 1      v3 = t^3 + 6t - 5
    v2 = 2t^3 - 3t^2 + 9t - 1     v4 = 2t^3 - 5t^2 + 7t + 5
Find a basis and the dimension of W.
The coordinate vectors of the given polynomials relative to the basis {t^3, t^2, t, 1} are respectively
    [v1] = (1, -2, 4, 1)      [v3] = (1, 0, 6, -5)
    [v2] = (2, -3, 9, -1)     [v4] = (2, -5, 7, 5)
Form the matrix whose rows are the above coordinate vectors, and row reduce to echelon form:
    ( 1 -2  4   1 )    to  ( 1 -2  4   1 )    to  ( 1 -2  4   1 )
    ( 2 -3  9  -1 )        ( 0  1  1  -3 )        ( 0  1  1  -3 )
    ( 1  0  6  -5 )        ( 0  2  2  -6 )        ( 0  0  0   0 )
    ( 2 -5  7   5 )        ( 0 -1 -1   3 )        ( 0  0  0   0 )
The nonzero rows (1, -2, 4, 1) and (0, 1, 1, -3) of the echelon matrix form a basis of the space generated by the coordinate vectors, and so the corresponding polynomials
    t^3 - 2t^2 + 4t + 1   and   t^2 + t - 3
form a basis of W. Thus dim W = 2.
5.25. Find the dimension and a basis of the solution space W of the system
    x + 2y + 2z - s + 3t = 0
    x + 2y + 3z + s + t = 0
    3x + 6y + 8z + s + 5t = 0
Reduce the system to echelon form:
    x + 2y + 2z - s + 3t = 0
              z + 2s - 2t = 0
             2z + 4s - 4t = 0
or
    x + 2y + 2z - s + 3t = 0
              z + 2s - 2t = 0
The system in echelon form has 2 (nonzero) equations in 5 unknowns; hence the dimension of the solution space W is 5 - 2 = 3. The free variables are y, s and t. Set
    (i) y = 1, s = 0, t = 0,   (ii) y = 0, s = 1, t = 0,   (iii) y = 0, s = 0, t = 1
to obtain the respective solutions
    v1 = (-2, 1, 0, 0, 0),   v2 = (5, 0, -2, 1, 0),   v3 = (-7, 0, 2, 0, 1)
The set {v1, v2, v3} is a basis of the solution space W.
5.26. Find a homogeneous system whose solution set W is generated by
{(1, 2, 0, 3), (1, 1, 1, 4), (1, 0, 2, 5)}
Method 1. Let v = (x, y, z, w). Form the matrix M whose first rows are the given vectors and whose
last row is v; and then row reduce to echelon form:
/I 2 3\ /I 2 3 \ /I 2 3
„ _ I 1 1 1 4 r 1 1 1 ,.0 1 1 1
IO25II0 2 2 2 \^\(iQ2x + y + zhxy + w^
\x y
\Q 2x + y z 3x + w/ \0
The original first three rows show that W has dimension 2. Thus v&WH and only if the addi
tional row does not increase the dimension of the row space. Hence we set the last two entries
in the third row on the right equal to to obtain the required homogeneous system
2x + y + z =0
5x + y —w =
Method 2. We know that v = (x, y, z, w) ∈ W if and only if v is a linear combination of the generators of W:
(x, y, z, w) = r(1, -2, 0, 3) + s(1, -1, -1, 4) + t(1, 0, -2, 5)
The above vector equation in unknowns r, s and t is equivalent to the following system:
     r + s + t = x        r + s + t = x            r + s + t = x
    -2r - s = y      or   s + 2t = 2x + y     or   s + 2t = 2x + y        (1)
    -s - 2t = z           -s - 2t = z              0 = 2x + y + z
    3r + 4s + 5t = w      s + 2t = w - 3x          0 = -5x - y + w
Thus v ∈ W if and only if the above system has a solution, i.e. if
2x + y + z = 0
5x + y - w = 0
The above is the required homogeneous system.
Remark: Observe that the augmented matrix of the system (1) is the transpose of the matrix
M used in the first method.
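Method 1 translates directly into a rank test: v lies in W exactly when appending v to the generators does not raise the rank. A sketch with exact arithmetic (the function names are ours):

```python
from fractions import Fraction

def rank(rows):
    """Rank via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    piv = 0
    for col in range(len(m[0])):
        hit = next((r for r in range(piv, len(m)) if m[r][col] != 0), None)
        if hit is None:
            continue
        m[piv], m[hit] = m[hit], m[piv]
        for r in range(piv + 1, len(m)):
            f = m[r][col] / m[piv][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[piv])]
        piv += 1
    return sum(1 for row in m if any(x != 0 for x in row))

gens = [[1, -2, 0, 3], [1, -1, -1, 4], [1, 0, -2, 5]]

def in_W(v):
    # v belongs to the span iff it does not increase the rank
    return rank(gens + [list(v)]) == rank(gens)

# a generator is in W; (0, 0, 0, 1) violates 5x + y - w = 0, so it is not
assert in_W((1, -2, 0, 3)) and not in_W((0, 0, 0, 1))
```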
5.27. Let U and W be the following subspaces of R^4:
U = {(a, b, c, d) : b + c + d = 0},   W = {(a, b, c, d) : a + b = 0, c = 2d}
Find the dimension and a basis of (i) U, (ii) W, (iii) U ∩ W.
(i) We seek a basis of the set of solutions (a, b, c, d) of the equation
b + c + d = 0 or 0a + b + c + d = 0
The free variables are a, c and d. Set
(1) a = 1, c = 0, d = 0, (2) a = 0, c = 1, d = 0, (3) a = 0, c = 0, d = 1
to obtain the respective solutions
v1 = (1, 0, 0, 0), v2 = (0, -1, 1, 0), v3 = (0, -1, 0, 1)
The set {v1, v2, v3} is a basis of U, and dim U = 3.
(ii) We seek a basis of the set of solutions (a, b, c, d) of the system
    a + b = 0          a + b = 0
                 or
    c = 2d             c - 2d = 0
The free variables are b and d. Set
(1) b = 1, d = 0, (2) b = 0, d = 1
to obtain the respective solutions
v1 = (-1, 1, 0, 0), v2 = (0, 0, 2, 1)
The set {v1, v2} is a basis of W, and dim W = 2.
(iii) U ∩ W consists of those vectors (a, b, c, d) which satisfy the conditions defining U and the conditions defining W, i.e. the three equations
    b + c + d = 0          a + b = 0
    a + b = 0        or    b + c + d = 0
    c = 2d                 c - 2d = 0
The free variable is d. Set d = 1 to obtain the solution v = (3, -3, 2, 1). Thus {v} is a basis
of U ∩ W, and dim (U ∩ W) = 1.
5.28. Find the dimension of the vector space spanned by:
(i) (1, 2, 3, 1) and (1, 1, 2, 3)
(ii) (3, 6, 3, 9) and (2, 4, 2, 6)
(iii) t^3 + 2t^2 + 3t + 1 and 2t^3 + 4t^2 + 6t + 2
(iv) t^3 - 2t^2 + 5 and t^2 + 3t - 4
(v), (vi), (vii) three pairs of 2 x 2 matrices
Two nonzero vectors span a space W of dimension 2 if they are independent, and of dimension
1 if they are dependent. Recall that two vectors are dependent if and only if one is a multiple of
the other. Hence: (i) 2, (ii) 1, (iii) 1, (iv) 2, (v) 2, (vi) 1, (vii) 1.
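The multiple test in the paragraph above is easy to mechanize: two vectors are dependent iff all their 2 x 2 "cross products" u_i v_j - u_j v_i vanish. A short sketch:

```python
def dependent_pair(u, v):
    """True iff one of u, v is a scalar multiple of the other (u, v not both zero)."""
    n = len(u)
    return all(u[i] * v[j] - u[j] * v[i] == 0
               for i in range(n) for j in range(i + 1, n))

# case (ii): (2, 4, 2, 6) = (2/3)(3, 6, 3, 9), so the span has dimension 1
assert dependent_pair((3, 6, 3, 9), (2, 4, 2, 6))
# case (i): independent, so the span has dimension 2
assert not dependent_pair((1, 2, 3, 1), (1, 1, 2, 3))
```

The same test applies verbatim to matrices or polynomials once they are flattened into coordinate tuples.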
5.29. Let V be the vector space of 2 by 2 symmetric matrices over K. Show that
dim V = 3. (Recall that A = (a_ij) is symmetric iff A = A^t or, equivalently, a_ij = a_ji.)
An arbitrary 2 by 2 symmetric matrix is of the form A = (a b; b c) where a, b, c ∈ K.
(Note that there are three "variables".) Setting
(i) a = 1, b = 0, c = 0, (ii) a = 0, b = 1, c = 0, (iii) a = 0, b = 0, c = 1
we obtain the respective matrices
E1 = (1 0; 0 0),  E2 = (0 1; 1 0),  E3 = (0 0; 0 1)
We show that {E1, E2, E3} is a basis of V, that is, that it (1) generates V and (2) is independent.
(1) For the above arbitrary matrix A in V, we have
A = (a b; b c) = aE1 + bE2 + cE3
Thus {E1, E2, E3} generates V.
(2) Suppose xE1 + yE2 + zE3 = 0, where x, y, z are unknown scalars. That is, suppose
x(1 0; 0 0) + y(0 1; 1 0) + z(0 0; 0 1) = (0 0; 0 0)
Setting corresponding entries equal to each other, we obtain x = 0, y = 0, z = 0. In other words,
xE1 + yE2 + zE3 = 0 implies x = 0, y = 0, z = 0
Accordingly, {E1, E2, E3} is independent.
Thus {E1, E2, E3} is a basis of V and so the dimension of V is 3.
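Both claims can be checked by direct computation, since x E1 + y E2 + z E3 is just the symmetric matrix (x y; y z). A minimal sketch (the sample matrix A below is our own):

```python
E1 = [[1, 0], [0, 0]]
E2 = [[0, 1], [1, 0]]
E3 = [[0, 0], [0, 1]]

def combo(x, y, z):
    """x*E1 + y*E2 + z*E3, computed entrywise; equals [[x, y], [y, z]]."""
    return [[x * E1[i][j] + y * E2[i][j] + z * E3[i][j] for j in range(2)]
            for i in range(2)]

A = [[5, -2], [-2, 7]]  # an arbitrary symmetric example (ours)
assert combo(5, -2, 7) == A                # (1): {E1, E2, E3} generates V
assert combo(1, 2, 3) == [[1, 2], [2, 3]]  # combo(x, y, z) == [[x, y], [y, z]],
# so combo(x, y, z) = 0 forces x = y = z = 0 -- (2): independence
```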
5.30. Let V be the space of polynomials in t of degree ≤ n. Show that each of the following
is a basis of V:
(i) {1, t, t^2, ..., t^(n-1), t^n},   (ii) {1, 1 - t, (1 - t)^2, ..., (1 - t)^(n-1), (1 - t)^n}.
Thus dim V = n + 1.
(i) Clearly each polynomial in V is a linear combination of 1, t, ..., t^(n-1) and t^n. Furthermore,
1, t, ..., t^(n-1) and t^n are independent since none is a linear combination of the preceding polynomials. Thus {1, t, ..., t^n} is a basis of V.
(ii) (Note that by (i), dim V = n + 1; and so any n + 1 independent polynomials form a basis of
V.) Now each polynomial in the sequence 1, 1 - t, ..., (1 - t)^n is of degree higher than the
preceding ones and so is not a linear combination of the preceding ones. Thus the n + 1 polynomials 1, 1 - t, ..., (1 - t)^n are independent and so form a basis of V.
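Part (ii) can also be seen by expanding each (1 - t)^k in the basis {1, t, ..., t^n}: the coefficient matrix is triangular with nonzero diagonal, hence has rank n + 1. A quick sketch for a sample n:

```python
from math import comb

n = 4
# row k holds the coefficients of (1 - t)^k relative to {1, t, ..., t^n}
rows = [[(-1) ** j * comb(k, j) if j <= k else 0 for j in range(n + 1)]
        for k in range(n + 1)]

# lower triangular with diagonal entries (-1)^k, all nonzero -> rank n + 1
assert all(rows[k][j] == 0 for k in range(n + 1) for j in range(k + 1, n + 1))
assert all(rows[k][k] == (-1) ** k for k in range(n + 1))
```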
5.31. Let V be the vector space of ordered pairs of complex numbers over the real field R
(see Problem 4.42). Show that V is of dimension 4.
We claim that the following is a basis of V:
B = {(1, 0), (i, 0), (0, 1), (0, i)}
Suppose v ∈ V. Then v = (z, w) where z, w are complex numbers, and so v = (a + bi, c + di) where
a, b, c, d are real numbers. Then
v = a(1, 0) + b(i, 0) + c(0, 1) + d(0, i)
Thus B generates V.
The proof is complete if we show that B is independent. Suppose
x1(1, 0) + x2(i, 0) + x3(0, 1) + x4(0, i) = 0
where x1, x2, x3, x4 ∈ R. Then
(x1 + x2 i, x3 + x4 i) = (0, 0) and so
x1 + x2 i = 0
x3 + x4 i = 0
Accordingly x1 = 0, x2 = 0, x3 = 0, x4 = 0 and so B is independent.
5.32. Let V be the vector space of m x n matrices over a field K. Let E_ij ∈ V be the matrix
with 1 as the ij-entry and 0 elsewhere. Show that {E_ij} is a basis of V. Thus
dim V = mn.
We need to show that {E_ij} generates V and is independent.
Let A = (a_ij) be any matrix in V. Then A = Σ a_ij E_ij. Hence {E_ij} generates V.
Now suppose that Σ x_ij E_ij = 0 where the x_ij are scalars. The ij-entry of Σ x_ij E_ij is x_ij, and
the ij-entry of 0 is 0. Thus x_ij = 0, i = 1, ..., m, j = 1, ..., n. Accordingly the matrices E_ij are
independent.
Thus {E_ij} is a basis of V.
Remark: Viewing a vector in K^n as a 1 x n matrix, we have shown by the above result that the
usual basis defined in Example 5.3, page 88, is a basis of K^n and that dim K^n = n.
SUMS AND INTERSECTIONS
5.33. Suppose U and W are distinct 4-dimensional subspaces of a vector space V of dimension 6. Find the possible dimensions of U ∩ W.
Since U and W are distinct, U + W properly contains U and W; hence dim (U + W) > 4.
But dim (U + W) cannot be greater than 6, since dim V = 6. Hence we have two possibilities:
(i) dim (U + W) = 5, or (ii) dim (U + W) = 6. Using Theorem 5.8 that dim (U + W) = dim U +
dim W - dim (U ∩ W), we obtain
(i) 5 = 4 + 4 - dim (U ∩ W) or dim (U ∩ W) = 3
(ii) 6 = 4 + 4 - dim (U ∩ W) or dim (U ∩ W) = 2
That is, the dimension of U ∩ W must be either 2 or 3.
5.34. Let U and W be the subspaces of R^4 generated by
{(1, 1, 0, -1), (1, 2, 3, 0), (2, 3, 3, -1)} and {(1, 2, 2, -2), (2, 3, 2, -3), (1, 3, 4, -3)}
respectively. Find (i) dim (U + W), (ii) dim (U ∩ W).
(i) U + W is the space spanned by all six vectors. Hence form the matrix whose rows are the
given six vectors, and then row reduce to echelon form:

    [1 1 0 -1]      [1 1 0 -1]      [1 1 0 -1]
    [1 2 3  0]      [0 1 3  1]      [0 1 3  1]
    [2 3 3 -1]  to  [0 1 3  1]  to  [0 0 1  2]
    [1 2 2 -2]      [0 1 2 -1]      [0 0 0  0]
    [2 3 2 -3]      [0 1 2 -1]      [0 0 0  0]
    [1 3 4 -3]      [0 2 4 -2]      [0 0 0  0]

Since the echelon matrix has three nonzero rows, dim (U + W) = 3.
(ii) First find dim U and dim W. Form the two matrices whose rows are the generators of U and
W respectively and then row reduce each to echelon form:

    [1 1 0 -1]      [1 1 0 -1]      [1 1 0 -1]
    [1 2 3  0]  to  [0 1 3  1]  to  [0 1 3  1]
    [2 3 3 -1]      [0 1 3  1]      [0 0 0  0]

and

    [1 2 2 -2]      [1  2  2 -2]      [1 2 2 -2]
    [2 3 2 -3]  to  [0 -1 -2  1]  to  [0 1 2 -1]
    [1 3 4 -3]      [0  1  2 -1]      [0 0 0  0]

Since each of the echelon matrices has two nonzero rows, dim U = 2 and dim W = 2. Using
Theorem 5.8 that dim (U + W) = dim U + dim W - dim (U ∩ W), we have
3 = 2 + 2 - dim (U ∩ W) or dim (U ∩ W) = 1
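The whole argument reduces to three rank computations: the generators of U, those of W, and all six vectors stacked. A sketch reusing exact row reduction (helper names ours):

```python
from fractions import Fraction

def rank(rows):
    m = [[Fraction(x) for x in row] for row in rows]
    piv = 0
    for col in range(len(m[0])):
        hit = next((r for r in range(piv, len(m)) if m[r][col] != 0), None)
        if hit is None:
            continue
        m[piv], m[hit] = m[hit], m[piv]
        for r in range(piv + 1, len(m)):
            f = m[r][col] / m[piv][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[piv])]
        piv += 1
    return sum(1 for row in m if any(x != 0 for x in row))

U = [[1, 1, 0, -1], [1, 2, 3, 0], [2, 3, 3, -1]]
W = [[1, 2, 2, -2], [2, 3, 2, -3], [1, 3, 4, -3]]

dim_sum = rank(U + W)                   # dim (U + W): stack all generators
dim_int = rank(U) + rank(W) - dim_sum   # Theorem 5.8
assert (rank(U), rank(W), dim_sum, dim_int) == (2, 2, 3, 1)
```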
5.35. Let U be the subspace of R^5 generated by
{(1, 3, -2, 2, 3), (1, 4, -3, 4, 2), (2, 3, -1, -2, 9)}
and let W be the subspace generated by
{(1, 3, 0, 2, 1), (1, 5, -6, 6, 3), (2, 5, 3, 2, 1)}
Find a basis and the dimension of (i) U + W, (ii) U ∩ W.
(i) U + W is the space generated by all six vectors. Hence form the matrix whose rows are the
six vectors and then row reduce to echelon form:

    [1 3 -2  2 3]      [1  3 -2  2  3]      [1 3 -2 2  3]
    [1 4 -3  4 2]      [0  1 -1  2 -1]      [0 1 -1 2 -1]
    [2 3 -1 -2 9]  to  [0 -3  3 -6  3]  to  [0 0  2 0 -2]
    [1 3  0  2 1]      [0  0  2  0 -2]      [0 0  0 0  0]
    [1 5 -6  6 3]      [0  2 -4  4  0]      [0 0  0 0  0]
    [2 5  3  2 1]      [0 -1  7 -2 -5]      [0 0  0 0  0]

The set of nonzero rows of the echelon matrix,
{(1, 3, -2, 2, 3), (0, 1, -1, 2, -1), (0, 0, 2, 0, -2)}
is a basis of U + W; thus dim (U + W) = 3.
(ii) First find homogeneous systems whose solution sets are U and W respectively. Form the matrix
whose first rows are the generators of U and whose last row is (x, y, z, s, t) and then row reduce
to echelon form:

    [1 3 -2  2 3]      [1 3 -2 2  3]
    [1 4 -3  4 2]  to  [0 1 -1 2 -1]
    [2 3 -1 -2 9]      [0 0  0 0  0]
    [x y  z  s t]      [0 0  -x+y+z  4x-2y+s  -6x+y+t]
Set the entries of the last row equal to 0 to obtain the homogeneous system whose solution
set is U:
-x + y + z = 0,  4x - 2y + s = 0,  -6x + y + t = 0
Now form the matrix whose first rows are the generators of W and whose last row is
(x, y, z, s, t) and then row reduce to echelon form:

    [1 3  0 2 1]      [1 3  0 2 1]
    [1 5 -6 6 3]  to  [0 1 -3 2 1]
    [2 5  3 2 1]      [0 0  0 0 0]
    [x y  z s t]      [0 0  -9x+3y+z  4x-2y+s  2x-y+t]

Set the entries of the last row equal to 0 to obtain the homogeneous system whose solution
set is W:
-9x + 3y + z = 0,  4x - 2y + s = 0,  2x - y + t = 0
Combining both systems, we obtain the homogeneous system whose solution set is U ∩ W:

    -x + y + z = 0           -x + y + z = 0           -x + y + z = 0
    4x - 2y + s = 0          2y + 4z + s = 0          2y + 4z + s = 0
    -6x + y + t = 0    or    -5y - 6z + t = 0    or   8z + 5s + 2t = 0
    -9x + 3y + z = 0         -6y - 8z = 0             4z + 3s = 0
    4x - 2y + s = 0          2y + 4z + s = 0          s - 2t = 0
    2x - y + t = 0           y + 2z + t = 0

or finally

    -x + y + z = 0
    2y + 4z + s = 0
    8z + 5s + 2t = 0
    s - 2t = 0
There is one free variable, which is t; hence dim (U ∩ W) = 1. Setting t = 2, we obtain the
solution x = 1, y = 4, z = -3, s = 4, t = 2. Thus {(1, 4, -3, 4, 2)} is a basis of U ∩ W.
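The final vector can be double-checked without the long elimination: it lies in U ∩ W iff appending it to either set of generators leaves the rank unchanged. A sketch:

```python
from fractions import Fraction

def rank(rows):
    m = [[Fraction(x) for x in row] for row in rows]
    piv = 0
    for col in range(len(m[0])):
        hit = next((r for r in range(piv, len(m)) if m[r][col] != 0), None)
        if hit is None:
            continue
        m[piv], m[hit] = m[hit], m[piv]
        for r in range(piv + 1, len(m)):
            f = m[r][col] / m[piv][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[piv])]
        piv += 1
    return sum(1 for row in m if any(x != 0 for x in row))

U = [[1, 3, -2, 2, 3], [1, 4, -3, 4, 2], [2, 3, -1, -2, 9]]
W = [[1, 3, 0, 2, 1], [1, 5, -6, 6, 3], [2, 5, 3, 2, 1]]
v = [1, 4, -3, 4, 2]

# v is in a span iff adding it does not raise the rank
assert rank(U + [v]) == rank(U) and rank(W + [v]) == rank(W)
```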
COORDINATE VECTORS
5.36. Find the coordinate vector of v relative to the basis {(1, 1, 1), (1, 1, 0), (1, 0, 0)} of R^3
where (i) v = (4, -3, 2), (ii) v = (a, b, c).
In each case set v as a linear combination of the basis vectors using unknown scalars x, y and z:
v = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)
and then solve for the solution vector (x, y, z). (The solution is unique since the basis vectors are
linearly independent.)
(i) (4, -3, 2) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)
= (x, x, x) + (y, y, 0) + (z, 0, 0)
= (x + y + z, x + y, x)
Set corresponding components equal to each other to obtain the system
x + y + z = 4,  x + y = -3,  x = 2
Substitute x = 2 into the second equation to obtain y = -5; then put x = 2, y = -5 into
the first equation to obtain z = 7. Thus x = 2, y = -5, z = 7 is the unique solution to the
system and so the coordinate vector of v relative to the given basis is [v] = (2, -5, 7).
(ii) (a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z, x + y, x)
Then x + y + z = a,  x + y = b,  x = c
from which x = c, y = b - c, z = a - b. Thus [v] = (c, b - c, a - b), that is, [(a, b, c)] =
(c, b - c, a - b).
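Because the basis is "triangular" (the third vector fixes x, the second then fixes y, and the first fixes z), the coordinates come out by back-substitution. A sketch:

```python
def coords(v):
    """Coordinates of v relative to the basis {(1,1,1), (1,1,0), (1,0,0)} of R^3."""
    a, b, c = v
    x = c          # third component: x = c
    y = b - x      # second component: x + y = b
    z = a - x - y  # first component: x + y + z = a
    return (x, y, z)

assert coords((4, -3, 2)) == (2, -5, 7)   # part (i)
assert coords((1, 2, 3)) == (3, -1, -1)   # another example (ours)
```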
5.37. Let V be the vector space of 2 x 2 matrices over R. Find the coordinate vector of the
matrix A ∈ V relative to the basis
{(1 1; 1 1), (0 -1; 1 0), (1 -1; 0 0), (1 0; 0 0)}   where A = (2 3; 4 -7)
Set A as a linear combination of the matrices in the basis using unknown scalars x, y, z, w:
A = (2 3; 4 -7) = x(1 1; 1 1) + y(0 -1; 1 0) + z(1 -1; 0 0) + w(1 0; 0 0)
= (x x; x x) + (0 -y; y 0) + (z -z; 0 0) + (w 0; 0 0)
= (x + z + w, x - y - z; x + y, x)
Set corresponding entries equal to each other to obtain the system
x + z + w = 2,  x - y - z = 3,  x + y = 4,  x = -7
from which x = -7, y = 11, z = -21, w = 30. Thus [A] = (-7, 11, -21, 30). (Note that the coordinate vector of A must be a vector in R^4 since dim V = 4.)
5.38. Let W be the vector space of 2 x 2 symmetric matrices over R. (See Problem 5.29.)
Find the coordinate vector of the matrix A = (4 -11; -11 -7) relative to the basis
{(1 -2; -2 1), (2 1; 1 3), (4 -1; -1 -5)}
Set A as a linear combination of the matrices in the basis using unknown scalars x, y and z:
A = (4 -11; -11 -7) = x(1 -2; -2 1) + y(2 1; 1 3) + z(4 -1; -1 -5)
= (x + 2y + 4z, -2x + y - z; -2x + y - z, x + 3y - 5z)
Set corresponding entries equal to each other to obtain the equivalent system of linear equations
and reduce to echelon form:

    x + 2y + 4z = 4
    -2x + y - z = -11        x + 2y + 4z = 4          x + 2y + 4z = 4
    -2x + y - z = -11   or   5y + 7z = -3        or   5y + 7z = -3
    x + 3y - 5z = -7         y - 9z = -11             -52z = -52

We obtain z = 1 from the third equation, then y = -2 from the second equation, and then x = 4
from the first equation. Thus the solution of the system is x = 4, y = -2, z = 1; hence [A] =
(4, -2, 1). (Since dim W = 3 by Problem 5.29, the coordinate vector of A must be a vector in R^3.)
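The elimination above can be replayed exactly with rational arithmetic; the row operations in the comments are the same ones used in the text (with the duplicate equation dropped):

```python
from fractions import Fraction as F

# augmented rows of:  x + 2y + 4z = 4,  -2x + y - z = -11,  x + 3y - 5z = -7
rows = [[F(1), F(2), F(4), F(4)],
        [F(-2), F(1), F(-1), F(-11)],
        [F(1), F(3), F(-5), F(-7)]]

rows[1] = [a + 2 * b for a, b in zip(rows[1], rows[0])]   # eq2 + 2*eq1 -> 5y + 7z = -3
rows[2] = [a - b for a, b in zip(rows[2], rows[0])]       # eq3 - eq1   -> y - 9z = -11
rows[2] = [5 * a - b for a, b in zip(rows[2], rows[1])]   # 5*eq3 - eq2 -> -52z = -52

z = rows[2][3] / rows[2][2]
y = (rows[1][3] - rows[1][2] * z) / rows[1][1]
x = (rows[0][3] - rows[0][1] * y - rows[0][2] * z) / rows[0][0]
assert (x, y, z) == (4, -2, 1)   # so [A] = (4, -2, 1)
```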
5.39. Let {e1, e2, e3} and {f1, f2, f3} be bases of a vector space V (of dimension 3). Suppose
e1 = a1 f1 + a2 f2 + a3 f3
e2 = b1 f1 + b2 f2 + b3 f3        (1)
e3 = c1 f1 + c2 f2 + c3 f3
Let P be the matrix whose rows are the coordinate vectors of e1, e2 and e3 respectively,
relative to the basis {f_i}:
    P = [a1 a2 a3]
        [b1 b2 b3]
        [c1 c2 c3]

Show that, for any vector v ∈ V, [v]_e P = [v]_f. That is, multiplying the coordinate
vector of v relative to the basis {e_i} by the matrix P, we obtain the coordinate vector
of v relative to the basis {f_i}. (The matrix P is frequently called the change of basis
matrix.)
Suppose v = r e1 + s e2 + t e3; then [v]_e = (r, s, t). Using (1), we have
v = r(a1 f1 + a2 f2 + a3 f3) + s(b1 f1 + b2 f2 + b3 f3) + t(c1 f1 + c2 f2 + c3 f3)
= (r a1 + s b1 + t c1)f1 + (r a2 + s b2 + t c2)f2 + (r a3 + s b3 + t c3)f3
Hence [v]_f = (r a1 + s b1 + t c1, r a2 + s b2 + t c2, r a3 + s b3 + t c3)
On the other hand,

    [v]_e P = (r, s, t) [a1 a2 a3]
                        [b1 b2 b3]  = (r a1 + s b1 + t c1, r a2 + s b2 + t c2, r a3 + s b3 + t c3)
                        [c1 c2 c3]

Accordingly, [v]_e P = [v]_f.
Remark: In Chapter 8 we shall write coordinate vectors as column vectors rather than row
vectors. Then, by the above,

    Q [v]_e = [a1 b1 c1] [r]   [r a1 + s b1 + t c1]
              [a2 b2 c2] [s] = [r a2 + s b2 + t c2]  = [v]_f
              [a3 b3 c3] [t]   [r a3 + s b3 + t c3]

where Q is the matrix whose columns are the coordinate vectors of e1, e2 and e3 respectively, relative
to the basis {f_i}. Note that Q is the transpose of P and that Q appears on the left of the column
vector [v]_e whereas P appears on the right of the row vector [v]_e.
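A concrete instance in R^2 (the bases below are our own choice, not from the text): with f1 = (1, 0), f2 = (1, 1) and e1 = (2, 1) = 1·f1 + 1·f2, e2 = (0, 1) = -1·f1 + 1·f2, the change of basis matrix is P = (1 1; -1 1).

```python
from fractions import Fraction as F

P = [[1, 1], [-1, 1]]        # rows: [e1]_f and [e2]_f

# v = (3, 2): solving v = r*e1 + s*e2 gives [v]_e = (3/2, 1/2)
v_e = [F(3, 2), F(1, 2)]

# row vector times P, exactly as in the problem: [v]_e P
v_f = [sum(v_e[i] * P[i][j] for i in range(2)) for j in range(2)]

# solving v = p*f1 + q*f2 directly gives [v]_f = (1, 2)
assert v_f == [1, 2]
```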
RANK OF A MATRIX
5.40. Find the rank of the matrix A where:

(i) A = [1 3  1 -2 -3]
        [1 4  3 -1 -4]
        [2 3 -4 -7 -3]
        [3 8  1 -7 -8]

(ii) A =

(iii) A =
(i) Row reduce to echelon form:

    [1 3  1 -2 -3]      [1  3  1 -2 -3]      [1 3 1 -2 -3]
    [1 4  3 -1 -4]  to  [0  1  2  1 -1]  to  [0 1 2  1 -1]
    [2 3 -4 -7 -3]      [0 -3 -6 -3  3]      [0 0 0  0  0]
    [3 8  1 -7 -8]      [0 -1 -2 -1  1]      [0 0 0  0  0]

Since the echelon matrix has two nonzero rows, rank (A) = 2.
(ii) Since row rank equals column rank, it is easier to form the transpose of A and then row
reduce to echelon form. Doing so leaves three nonzero rows; thus rank (A) = 3.
(iii) The two columns are linearly independent since one is not a multiple of the other. Hence
rank (A) = 2.
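Rank computations like (i) follow the same elimination recipe; in exact arithmetic (helper names ours):

```python
from fractions import Fraction

def rank(rows):
    m = [[Fraction(x) for x in row] for row in rows]
    piv = 0
    for col in range(len(m[0])):
        hit = next((r for r in range(piv, len(m)) if m[r][col] != 0), None)
        if hit is None:
            continue
        m[piv], m[hit] = m[hit], m[piv]
        for r in range(piv + 1, len(m)):
            f = m[r][col] / m[piv][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[piv])]
        piv += 1
    return sum(1 for row in m if any(x != 0 for x in row))

A = [[1, 3, 1, -2, -3],
     [1, 4, 3, -1, -4],
     [2, 3, -4, -7, -3],
     [3, 8, 1, -7, -8]]
assert rank(A) == 2
# transposing never changes the rank (row rank = column rank), as used in (ii)
assert rank([list(col) for col in zip(*A)]) == 2
```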
5.41. Let A and B be arbitrary matrices for which the product AB is defined. Show that
rank (AB) ≤ rank (B) and rank (AB) ≤ rank (A).
By Problem 4.33, page 80, the row space of AB is contained in the row space of B; hence
rank (AB) ≤ rank (B). Furthermore, by Problem 4.71, page 84, the column space of AB is contained
in the column space of A; hence rank (AB) ≤ rank (A).
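The inequality is easy to witness numerically; with the rank routine from Problem 5.40 and a pair of sample matrices (ours), rank (AB) can even be strictly smaller than both ranks:

```python
from fractions import Fraction

def rank(rows):
    m = [[Fraction(x) for x in row] for row in rows]
    piv = 0
    for col in range(len(m[0])):
        hit = next((r for r in range(piv, len(m)) if m[r][col] != 0), None)
        if hit is None:
            continue
        m[piv], m[hit] = m[hit], m[piv]
        for r in range(piv + 1, len(m)):
            f = m[r][col] / m[piv][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[piv])]
        piv += 1
    return sum(1 for row in m if any(x != 0 for x in row))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 0], [0, 0]]   # rank 1
B = [[0, 0], [0, 1]]   # rank 1
AB = matmul(A, B)      # the zero matrix, rank 0
assert rank(AB) <= min(rank(A), rank(B))
assert rank(AB) == 0
```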
5.42. Let A be an n-square matrix. Show that A is invertible if and only if rank (A) = n.
Note that the rows of the n-square identity matrix I_n are linearly independent since I_n is in
echelon form; hence rank (I_n) = n. Now if A is invertible then, by Problem 3.36, page 57, A is row
equivalent to I_n; hence rank (A) = n. But if A is not invertible then A is row equivalent to a matrix
with a zero row; hence rank (A) < n. That is, A is invertible if and only if rank (A) = n.
5.43. Let x_{i1}, x_{i2}, ..., x_{ik} be the free variables of a homogeneous system of linear equations
with n unknowns. Let v_j be the solution for which x_{ij} = 1, and all other free variables = 0. Show that the solutions v1, v2, ..., vk are linearly independent.
Let A be the matrix whose rows are the v_i respectively. We interchange column 1 and column
i_1, then column 2 and column i_2, ..., and then column k and column i_k, and obtain the k x n matrix

    B = [1 0 ... 0 c_{1,k+1} ... c_{1n}]
        [0 1 ... 0 c_{2,k+1} ... c_{2n}]   =  (I, C)
        [...............................]
        [0 0 ... 1 c_{k,k+1} ... c_{kn}]

The above matrix B is in echelon form and so its rows are independent; hence rank (B) = k. Since
A and B are column equivalent, they have the same rank, i.e. rank (A) = k. But A has k rows;
hence these rows, i.e. the v_j, are linearly independent, as claimed.
MISCELLANEOUS PROBLEMS
5.44. The concept of linear dependence is extended to every set of vectors, finite or infinite,
as follows: the set of vectors A = {v_i} is linearly dependent iff there exist vectors
v_{i1}, ..., v_{in} ∈ A and scalars a1, ..., an ∈ K, not all of them 0, such that
a1 v_{i1} + a2 v_{i2} + ... + an v_{in} = 0
Otherwise A is said to be linearly independent. Suppose that A1, A2, ... are linearly
independent sets of vectors, and that A1 ⊆ A2 ⊆ .... Show that the union A =
A1 ∪ A2 ∪ ... is also linearly independent.
Suppose A is linearly dependent. Then there exist vectors v1, ..., vn ∈ A and scalars a1, ...,
an ∈ K, not all of them 0, such that
a1 v1 + a2 v2 + ... + an vn = 0        (1)
Since A = ∪ A_i and the v_i ∈ A, there exist sets A_{i1}, ..., A_{in} such that
v1 ∈ A_{i1}, v2 ∈ A_{i2}, ..., vn ∈ A_{in}
Let k be the maximum index of the sets A_{ij}: k = max (i1, ..., in). It follows then, since A1 ⊆ A2 ⊆ ...,
that each A_{ij} is contained in A_k. Hence v1, v2, ..., vn ∈ A_k and so, by (1), A_k is linearly dependent,
which contradicts our hypothesis. Thus A is linearly independent.
5.45. Consider a finite sequence of vectors S = {v1, v2, ..., vn}. Let T be the sequence of
vectors obtained from S by one of the following "elementary operations": (i) interchange two vectors, (ii) multiply a vector by a nonzero scalar, (iii) add a multiple of
one vector to another. Show that S and T generate the same space W. Also show
that T is independent if and only if S is independent.
Observe that, for each operation, the vectors in T are linear combinations of vectors in S. On
the other hand, each operation has an inverse of the same type (Prove!); hence the vectors in S are
linear combinations of vectors in T. Thus S and T generate the same space W. Also, T is independent if and only if dim W = n, and this is true iff S is also independent.
5.46. Let A = (a_ij) and B = (b_ij) be row equivalent m x n matrices over a field K, and let
v1, ..., vn be any vectors in a vector space V over K. Let
u1 = a11 v1 + a12 v2 + ... + a1n vn        w1 = b11 v1 + b12 v2 + ... + b1n vn
u2 = a21 v1 + a22 v2 + ... + a2n vn        w2 = b21 v1 + b22 v2 + ... + b2n vn
.................................          .................................
um = am1 v1 + am2 v2 + ... + amn vn        wm = bm1 v1 + bm2 v2 + ... + bmn vn
Show that {u_i} and {w_i} generate the same space.
Applying an "elementary operation" of the preceding problem to {u_i} is equivalent to applying
an elementary row operation to the matrix A. Since A and B are row equivalent, B can be obtained
from A by a sequence of elementary row operations; hence {w_i} can be obtained from {u_i} by the
corresponding sequence of operations. Accordingly, {u_i} and {w_i} generate the same space.
5.47. Let v1, ..., vn belong to a vector space V over a field K. Let
w1 = a11 v1 + a12 v2 + ... + a1n vn
w2 = a21 v1 + a22 v2 + ... + a2n vn
.................................
wn = an1 v1 + an2 v2 + ... + ann vn
where a_ij ∈ K. Let P be the n-square matrix of coefficients, i.e. let P = (a_ij).
(i) Suppose P is invertible. Show that {w_i} and {v_i} generate the same space; hence
{w_i} is independent if and only if {v_i} is independent.
(ii) Suppose P is not invertible. Show that {w_i} is dependent.
(iii) Suppose {w_i} is independent. Show that P is invertible.
(i) Since P is invertible, it is row equivalent to the identity matrix I. Hence by the preceding
problem {w_i} and {v_i} generate the same space. Thus one is independent if and only if the
other is.
(ii) Since P is not invertible, it is row equivalent to a matrix with a zero row. This means that
{w_i} generates a space which has a generating set of fewer than n elements. Thus {w_i} is
dependent.
(iii) This is the contrapositive of the statement of (ii), and so it follows from (ii).
5.48. Suppose V is the direct sum of its subspaces U and W, i.e. V = U ⊕ W. Show that:
(i) if {u1, ..., um} ⊆ U and {w1, ..., wn} ⊆ W are independent, then {u_i, w_j} is also
independent; (ii) dim V = dim U + dim W.
(i) Suppose a1 u1 + ... + am um + b1 w1 + ... + bn wn = 0, where a_i, b_j are scalars. Then
0 = (a1 u1 + ... + am um) + (b1 w1 + ... + bn wn) = 0 + 0
where 0, a1 u1 + ... + am um ∈ U and 0, b1 w1 + ... + bn wn ∈ W. Since such a sum for 0 is
unique, this leads to
a1 u1 + ... + am um = 0,  b1 w1 + ... + bn wn = 0
The independence of the u_i implies that the a_i are all 0, and the independence of the w_j implies
that the b_j are all 0. Consequently, {u_i, w_j} is independent.
(ii) Method 1. Since V = U ⊕ W, we have V = U + W and U ∩ W = {0}. Thus, by Theorem
5.8, page 90,
dim V = dim U + dim W - dim (U ∩ W) = dim U + dim W - 0 = dim U + dim W
Method 2. Suppose {u1, ..., um} and {w1, ..., wn} are bases of U and W respectively. Since
they generate U and W respectively, {u_i, w_j} generates V = U + W. On the other hand, by
(i), {u_i, w_j} is independent. Thus {u_i, w_j} is a basis of V; hence dim V = dim U + dim W.
5.49. Let U be a subspace of a vector space V of finite dimension. Show that there exists
a subspace W of V such that V = U ⊕ W.
Let {u1, ..., ur} be a basis of U. Since {u_i} is linearly independent, it can be extended to a
basis of V, say, {u1, ..., ur, w1, ..., ws}. Let W be the space generated by {w1, ..., ws}. Since
{u_i, w_j} generates V, V = U + W. On the other hand, U ∩ W = {0} (Problem 5.62). Accordingly,
V = U ⊕ W.
5.50. Recall (page 65) that if K is a subfield of a field E (or: E is an extension of K), then
E may be viewed as a vector space over K. (i) Show that the complex field C is a
vector space of dimension 2 over the real field R. (ii) Show that the real field R is a
vector space of infinite dimension over the rational field Q.
(i) We claim that {1, i} is a basis of C over R. For if v ∈ C, then v = a + bi = a·1 + b·i
where a, b ∈ R; that is, {1, i} generates C over R. Furthermore, if x·1 + y·i = 0 or
x + yi = 0, where x, y ∈ R, then x = 0 and y = 0; that is, {1, i} is linearly independent
over R. Thus {1, i} is a basis of C over R, and so C is of dimension 2 over R.
(ii) We claim that, for any n, {1, π, π^2, ..., π^n} is linearly independent over Q. For suppose
a0·1 + a1·π + a2·π^2 + ... + an·π^n = 0, where the a_i ∈ Q, and not all the a_i are 0. Then π is a
root of the following nonzero polynomial over Q: a0 + a1 x + a2 x^2 + ... + an x^n. But it can be
shown that π is a transcendental number, i.e. that π is not a root of any nonzero polynomial
over Q. Accordingly, the n + 1 real numbers 1, π, π^2, ..., π^n are linearly independent over Q.
Thus for any finite n, R cannot be of dimension n over Q, i.e. R is of infinite dimension over Q.
5.51. Let K be a subfield of a field L and L a subfield of a field E: K ⊆ L ⊆ E. (Hence K is a
subfield of E.) Suppose that E is of dimension n over L and L is of dimension m
over K. Show that E is of dimension mn over K.
Suppose {v1, ..., vn} is a basis of E over L and {a1, ..., am} is a basis of L over K. We claim
that {a_i v_j : i = 1, ..., m, j = 1, ..., n} is a basis of E over K. Note that {a_i v_j} contains mn
elements.
Let w be any arbitrary element in E. Since {v1, ..., vn} generates E over L, w is a linear combination of the v_j with coefficients in L:
w = b1 v1 + b2 v2 + ... + bn vn,  b_i ∈ L        (1)
Since {a1, ..., am} generates L over K, each b_i ∈ L is a linear combination of the a_j with coefficients in K:
b1 = k11 a1 + k12 a2 + ... + k1m am
.................................
bn = kn1 a1 + kn2 a2 + ... + knm am
where k_ij ∈ K. Substituting in (1), we obtain
w = (k11 a1 + ... + k1m am)v1 + (k21 a1 + ... + k2m am)v2 + ... + (kn1 a1 + ... + knm am)vn
= k11 a1 v1 + ... + k1m am v1 + k21 a1 v2 + ... + k2m am v2 + ... + kn1 a1 vn + ... + knm am vn
Thus w is a linear combination of the a_i v_j with coefficients in K; hence {a_i v_j} generates E over K.
The proof is complete if we show that {a_i v_j} is linearly independent over K. Suppose, for scalars
x_ji ∈ K, Σ x_ji (a_i v_j) = 0; that is,
(x11 a1 v1 + x12 a2 v1 + ... + x1m am v1) + ... + (xn1 a1 vn + xn2 a2 vn + ... + xnm am vn) = 0
or (x11 a1 + x12 a2 + ... + x1m am)v1 + ... + (xn1 a1 + xn2 a2 + ... + xnm am)vn = 0
Since {v1, ..., vn} is linearly independent over L and since the above coefficients of the v_j belong to
L, each coefficient must be 0:
x11 a1 + x12 a2 + ... + x1m am = 0, ..., xn1 a1 + xn2 a2 + ... + xnm am = 0
But {a1, ..., am} is linearly independent over K; hence since the x_ji ∈ K,
x11 = 0, x12 = 0, ..., x1m = 0, ..., xn1 = 0, xn2 = 0, ..., xnm = 0
Accordingly, {a_i v_j} is linearly independent over K and the theorem is proved.
Supplementary Problems
LINEAR DEPENDENCE
5.52. Determine whether u and v are linearly dependent where:
(i) u = (1, 2, 3, 4), v = (4, 3, 2, 1)          (iii) u = (0, 1), v = (0, 3)
(ii) u = (1, 6, 12), v = ( , 3, 6)              (iv) u = (1, 0, 0), v = (0, 0, 3)
(vii) u = t^3 + t^2 - 16, v = -(1/2)t^3 - (1/2)t^2 + 8    (viii) u = t^3 + 3t + 4, v = t^3 + 4t + 3
5.53. Determine whether the following vectors in R^4 are linearly dependent or independent:
(i) (1, 3, -1, 4), (3, 8, -5, 7), (2, 9, 4, 23);  (ii) (1, -2, 4, 1), (2, 1, 0, -3), (3, -6, 1, 4).
5.54. Let V be the vector space of 2 x 3 matrices over R. Determine whether the matrices A, B, C ∈ V
are linearly dependent or independent where:
1 2 3\ „ _ /I 1 4\ /3 8 7
4 5 2 7 ' I 2 10 1
<"'^=a4:). (4 I'D ^=(r4s
5.55. Let V be the vector space of polynomials of degree ≤ 3 over R. Determine whether u, v, w ∈ V are
linearly dependent or independent where:
(i) u = t^3 - 4t^2 + 2t + 3, v = t^3 + 2t^2 + 4t - 1, w = 2t^3 - t^2 - 3t + 5
(ii) u = t^3 - 5t^2 - 2t + 3, v = t^3 - 4t^2 - 3t + 4, w = 2t^3 - 7t^2 - 7t + 9
5.56. Let V be the vector space of functions from R into R. Show that f, g, h ∈ V are linearly independent where: (i) f(t) = e^t, g(t) = sin t, h(t) = t^2; (ii) f(t) = e^t, g(t) = e^{2t}, h(t) = t; (iii) f(t) = e^t,
g(t) = sin t, h(t) = cos t.
5.57. Show that: (i) the vectors (1 - i, i) and (2, -1 + i) in C^2 are linearly dependent over the complex
field C but are linearly independent over the real field R; (ii) the vectors (3 + √2, 1 + √2) and
(7, 1 + 2√2) in R^2 are linearly dependent over the real field R but are linearly independent over the
rational field Q.
5.58. Suppose u, v and w are linearly independent vectors. Show that:
(i) u + v - 2w, u - v - w and u + w are linearly independent;
(ii) u + v - 3w, u + 3v - w and v + w are linearly dependent.
5.59. Prove or show a counterexample: If the nonzero vectors u, v and w are linearly dependent, then w
is a linear combination of u and v.
5.60. Suppose v1, v2, ..., vn are linearly independent vectors. Prove the following:
(i) {a1 v1, a2 v2, ..., an vn} is linearly independent where each a_i ≠ 0.
(ii) {v1, ..., v_{i-1}, w, v_{i+1}, ..., vn} is linearly independent where w = b1 v1 + ... + b_i v_i + ... + bn vn
and b_i ≠ 0.
5.61. Let v = (a, b) and w = (c, d) belong to K^2. Show that {v, w} is linearly dependent if and only if
ad - bc = 0.
5.62. Suppose {u1, ..., ur, w1, ..., ws} is a linearly independent subset of a vector space V. Show that
L(u_i) ∩ L(w_j) = {0}. (Recall that L(u_i) is the linear span, i.e. the space generated by the u_i.)
5.63. Suppose (a11, ..., a1n), ..., (am1, ..., amn) are linearly independent vectors in K^n, and suppose
v1, ..., vn are linearly independent vectors in a vector space V over K. Show that the vectors
w1 = a11 v1 + ... + a1n vn, ..., wm = am1 v1 + ... + amn vn
are also linearly independent.
BASIS AND DIMENSION
5.64. Determine whether or not each of the following forms a basis of R^:
(i) (1, 1) and (3, 1) (iii) (0, 1) and (0, 3)
(ii) (2, 1), (1, 1) and (0, 2) (iv) (2, 1) and (3, 87)
5.65. Determine whether or not each of the following forms a basis of R^:
(i) (1, 2, 1) and (0, 3, 1)
(ii) (2, 4, 3), (0, 1, 1) and (0, 1, 1)
(iii) (1, 5, 6), (2, 1, 8), (3, 1, 4) and (2, 1, 1)
(iv) (1, 3, 4), (1, 4, 3) and (2, 3, 11)
5.66. Find a basis and the dimension of the subspace W of R* generated by:
(i) (1, 4, 1, 3), (2, 1, 3, 1) and (0, 2, 1, 5)
(ii) (1, 4, 2, 1), (1, 3, 1, 2) and (3, 8, 2, 7)
5.67. Let V be the space of 2 X 2 matrices over R and let W be the subspace generated by
ui). (: \y (r:)  (4;;
Find a basis and the dimension of W.
5.68. Let W be the space generated by the polynomials
u = t^3 + 2t^2 - 2t + 1,  v = t^3 + 3t^2 - t + 4  and  w = 2t^3 + t^2 - 7t - 7
Find a basis and the dimension of W.
5.69. Find a basis and the dimension of the solution space W of each homogeneous system:

    x + 3y + 2z = 0        x + 4y + 2z = 0        x - 2y + 7z = 0
    x + 5y + z = 0         2x + y + 5z = 0        2x + 3y - 2z = 0
    3x + 5y + 8z = 0                              2x - y + z = 0
         (i)                    (ii)                   (iii)
5.70. Find a basis and the dimension of the solution space W of each homogeneous system:

    x + 2y - 2z + 2s - t = 0        x + 2y - z + 3s - 4t = 0
    x + 2y - z + 3s - 2t = 0        2x + 4y - 2z - s + 5t = 0
    2x + 4y - 7z + s + t = 0        2x + 4y - 2z + 4s - 2t = 0
              (i)                             (ii)
5.71. Find a homogeneous system whose solution set W is generated by
{(1, 2, 0, 3, 1), (2, 3, 2, 5, 3), (1, 2, 1, 2, 2)}
5.72. Let V and W be the following subspaces of R^4:
V = {(a, b, c, d) : b - 2c + d = 0},   W = {(a, b, c, d) : a = d, b = 2c}
Find a basis and the dimension of (i) V, (ii) W, (iii) V ∩ W.
5.73. Let V be the vector space of polynomials in t of degree ≤ n. Determine whether or not each of the
following is a basis of V:
(i) {1, 1 + t, 1 + t + t^2, ..., 1 + t + t^2 + ... + t^(n-1) + t^n}
(ii) {1 + t, t + t^2, t^2 + t^3, ..., t^(n-2) + t^(n-1), t^(n-1) + t^n}.
SUMS AND INTERSECTIONS
5.74. Suppose V and W are 2-dimensional subspaces of R^3. Show that V ∩ W ≠ {0}.
5.75. Suppose U and W are subspaces of V and that dim U = 4, dim W = 5 and dim V = 7. Find the
possible dimensions of U ∩ W.
5.76. Let U and W be subspaces of R^3 for which dim U = 1, dim W = 2 and U ⊄ W. Show that
R^3 = U ⊕ W.
5.77. Let U be the subspace of R^5 generated by
{(1, 3, 3, 1, 4), (1, 4, 1, 2, 2), (2, 9, 0, 5, 2)}
and let W be the subspace generated by
{(1, 6, 2, 2, 3), (2, 8, 1, 6, 5), (1, 3, 1, 5, 6)}
Find (i) dim (U + W), (ii) dim (U ∩ W).
5.78. Let V be the vector space of polynomials over R. Let U and W be the subspaces generated by
{t^3 + 4t^2 - t + 3, t^3 + 5t^2 + 5, 3t^3 + 10t^2 - 5t + 5} and {t^3 + 4t^2 + 6, t^3 + 2t^2 - t + 5, 2t^3 + 2t^2 - 3t + 9}
respectively. Find (i) dim (U + W), (ii) dim (U ∩ W).
5.79. Let U be the subspace of R^5 generated by
{(1, 1, 1, 2, 0), (1, 2, 2, 0, 3), (1, 1, 2, 2, 1)}
and let W be the subspace generated by
{(1, 2, 3, 0, 2), (1, 1, 3, 2, 4), (1, 1, 2, 2, 5)}
(i) Find two homogeneous systems whose solution spaces are U and W, respectively.
(ii) Find a basis and the dimension of U ∩ W.
COORDINATE VECTORS
5.80. Consider the following basis of R^2: {(2, 1), (1, 1)}. Find the coordinate vector of v ∈ R^2 relative
to the above basis where: (i) v = (2, 3); (ii) v = (4, 1); (iii) v = (3, 3); (iv) v = (a, b).
5.81. In the vector space V of polynomials in t of degree ≤ 3, consider the following basis: {1, 1 - t, (1 - t)^2,
(1 - t)^3}. Find the coordinate vector of v ∈ V relative to the above basis if: (i) v = 2 - 3t + t^2 + 2t^3;
(ii) v = 3 - 2t - t^2; (iii) v = a + bt + ct^2 + dt^3.
5.82. In the vector space W of 2 x 2 symmetric matrices over R, consider the following basis:
{(11). a :)•(' 1)}
Find the coordinate vector of the matrix A ∈ W relative to the above basis if:
...  = (4 I) <"' ^<l I)
5.83. Consider the following two bases of R^3:
{e1 = (1, 1, 1), e2 = (0, 2, 3), e3 = (0, 2, 1)} and {f1 = (1, 1, 0), f2 = (1, -1, 0), f3 = (0, 0, 1)}
(i) Find the coordinate vector of v = (3, 5, 2) relative to each basis: [v]_e and [v]_f.
(ii) Find the matrix P whose rows are respectively the coordinate vectors of the e_i relative to the
basis {f1, f2, f3}.
(iii) Verify that [v]_e P = [v]_f.
5.84. Suppose {e1, ..., en} and {f1, ..., fn} are bases of a vector space V (of dimension n). Let P be the
matrix whose rows are respectively the coordinate vectors of the e's relative to the basis {f_i}. Prove
that for any vector v ∈ V, [v]_e P = [v]_f. (This result is proved in Problem 5.39 in the case n = 3.)
5.85. Show that the coordinate vector of 0 ∈ V relative to any basis of V is always the zero n-tuple
(0, 0, ..., 0).
RANK OF A MATRIX
5.86. Find the rank of each matrix:
5.87. Let A and B be arbitrary m x n matrices. Show that rank (A + B) ≤ rank (A) + rank (B).
5.88. Give examples of 2 X 2 matrices A and B such that:
(i) rank (A + B)< rank (A), rank {B) (ii) rank (A + B) = rank (A) = rank (B)
(iii) rank (A + B) > rank (A), rank (B)
MISCELLANEOUS PROBLEMS
5.89. Let W be the vector space of 3 x 3 symmetric matrices over K. Show that dim W = 6 by exhibiting a basis of W. (Recall that A = (a_ij) is symmetric iff a_ij = a_ji.)
5.90. Let W be the vector space of 3 x 3 antisymmetric matrices over K. Show that dim W = 3 by
exhibiting a basis of W. (Recall that A = (a_ij) is antisymmetric iff a_ij = -a_ji.)
5.91. Suppose dim V = n. Show that a generating set with n elements is a basis. (Compare with Theorem
5.6(iii), page 89.)
5.92. Let t1, t2, ..., tn be symbols, and let K be any field. Let V be the set of expressions a1t1 + a2t2 + · · · + antn where ai ∈ K. Define addition in V by
(a1t1 + a2t2 + · · · + antn) + (b1t1 + b2t2 + · · · + bntn) = (a1 + b1)t1 + (a2 + b2)t2 + · · · + (an + bn)tn
Define scalar multiplication on V by
k(a1t1 + a2t2 + · · · + antn) = ka1t1 + ka2t2 + · · · + kantn
Show that V is a vector space over K with the above operations. Also show that {t1, ..., tn} is a basis of V where, for i = 1, ..., n,
ti = 0t1 + · · · + 0t_{i−1} + 1ti + 0t_{i+1} + · · · + 0tn
5.93. Let V be a vector space of dimension n over a field K, and let K be a vector space of dimension m over a subfield F. (Hence V may also be viewed as a vector space over the subfield F.) Prove that the dimension of V over F is mn.
5.94. Let U and W be vector spaces over the same field K, and let V be the external direct sum of U and W (see Problem 4.45). Let Û and Ŵ be the subspaces of V defined by Û = {(u, 0) : u ∈ U} and Ŵ = {(0, w) : w ∈ W}.
(i) Show that U is isomorphic to Û under the correspondence u ↔ (u, 0), and that W is isomorphic to Ŵ under the correspondence w ↔ (0, w).
(ii) Show that dim V = dim U + dim W.
5.95. Suppose V = U ⊕ W. Let V̂ be the external direct product of U and W. Show that V is isomorphic to V̂ under the correspondence v = u + w ↔ (u, w).
Answers to Supplementary Problems
5.52. (i) no, (ii) yes, (iii) yes, (iv) no, (v) yes, (vi) no, (vii) yes, (viii) no.
5.53. (i) dependent, (ii) independent.
5.54. (i) dependent, (ii) independent.
5.55. (i) independent, (ii) dependent.
5.57. (i) (2, −1 + i) = (1 + i)(1 − i, i); (ii) (7, 1 + 2√2) = (3 − √2)(3 + √2, 1 + √2).
5.59. The statement is false. Counterexample: u = (1, 0), v = (2, 0) and w = (1, 1) in R². Lemma 5.2 requires that one of the nonzero vectors u, v, w is a linear combination of the preceding ones. In this case, v = 2u.
5.64. (i) yes, (ii) no, (iii) no, (iv) yes.
5.65. (i) no, (ii) yes, (iii) no, (iv) no.
5.66. (i) dim W = 3, (ii) dim W = 2.
5.67. dim W = 2.
5.68. dim Ty = 2.
5.69. (i) basis, {(7, 1, 2)}; dim W = 1. (ii) dim W = 0. (iii) basis, {(18, 1, 7)}; dim W = 1.
5.70. (i) basis, {(2, 1, 0, 0, 0), (4, 0, 1, 1, 0), (3, 0, 1, 0, 1)}; dim W = 3.
(ii) basis, {(2, 1, 0, 0, 0), (1, 0, 1, 0, 0)}; dim W = 2.
5.71. (i) basis, {(1, 0, 0, 0), (0, 2, 1, 0), (0, 1, 0, 1)}; dim V = 3.
(ii) basis, {(1, 0, 0, 1), (0, 2, 1, 0)}; dim W = 2.
(iii) basis, {(0, 2, 1, 0)}; dim (V ∩ W) = 1. Hint. V ∩ W must satisfy all three conditions on a, b, c and d.
5.72. (i) yes, (ii) no. For dim V = n + 1, but the set contains only n elements.
5.73. dim (U ∩ W) = 2, 3 or 4.
5.75. dim (U + W) = 3, dim (U ∩ W) = 2.
5.77. dim (U + W) = 3, dim (U ∩ W) = 1.
5.78. 5x + y − z − s = 0
      x + y − z − t = 0
5.79. (i) (a homogeneous system of linear equations, not legible in the source)
(ii) {(1, 2, 5, 0, 0), (0, 0, 1, 0, 1)}; dim (V ∩ W) = 2.
5.80. (i) [v] = (5/3, 4/3), (ii) [v] = (1, 2), (iii) [v] = (0, 3), (iv) [v] = ((a + b)/3, (a − 2b)/3).
5.81. (i) [v] = (2, −5, 7, −2), (ii) [v] = (0, 4, −1, 0), (iii) [v] = (a + b + c + d, −b − 2c − 3d, c + 3d, −d).
5.82. (i) [A] = (2, 1, 1), (ii) [A] = (3, 1, 2).
5.83. (i) [v]_e = (3, −1, 2), [v]_f = (4, −1, 2); (ii) P = [1 0 1; 1 −1 3; 1 −1 1] (rows).
5.86. (i) 3, (ii) 2, (iii) 3, (iv) 2.
5.88. For example (rows separated by semicolons):
(i) A = [1 0; 0 0], B = [−1 0; 0 0]
(ii) A = [1 0; 0 0], B = [0 1; 0 0]
(iii) A = [1 0; 0 0], B = [0 0; 0 1]
5.89. [1 0 0; 0 0 0; 0 0 0], [0 1 0; 1 0 0; 0 0 0], [0 0 1; 0 0 0; 1 0 0], [0 0 0; 0 1 0; 0 0 0], [0 0 0; 0 0 1; 0 1 0], [0 0 0; 0 0 0; 0 0 1]
5.90. [0 1 0; −1 0 0; 0 0 0], [0 0 1; 0 0 0; −1 0 0], [0 0 0; 0 0 1; 0 −1 0]
5.93. Hint. The proof is identical to that given in Problem 5.48, page 113, for a special case (when V is
an extension field of K).
chapter 6
Linear Mappings
MAPPINGS
Let A and B be arbitrary sets. Suppose to each a ∈ A there is assigned a unique element of B; the collection, f, of such assignments is called a function or mapping (or map) from A into B, and is written
f : A → B
We write f(a), read "f of a", for the element of B that f assigns to a ∈ A; it is called the value of f at a or the image of a under f. If A′ is any subset of A, then f(A′) denotes the set of images of elements of A′; and if B′ is any subset of B, then f⁻¹(B′) denotes the set of elements of A each of whose image lies in B′:
f(A′) = {f(a) : a ∈ A′}  and  f⁻¹(B′) = {a ∈ A : f(a) ∈ B′}
We call f(A′) the image of A′ and f⁻¹(B′) the inverse image or preimage of B′. In particular, the set of all images, i.e. f(A), is called the image (or: range) of f. Furthermore, A is called the domain of the mapping f : A → B, and B is called its codomain.
To each mapping f : A → B there corresponds the subset of A × B given by {(a, f(a)) : a ∈ A}. We call this set the graph of f. Two mappings f : A → B and g : A → B are defined to be equal, written f = g, if f(a) = g(a) for every a ∈ A, that is, if they have the same graph. Thus we do not distinguish between a function and its graph. The negation of f = g is written f ≠ g and is the statement: there exists an a ∈ A for which f(a) ≠ g(a).
Example 6.1: Let A = {a, b, c, d} and B = {x, y, z, w}. The following diagram defines a mapping f from A into B:
(diagram)
Here f(a) = y, f(b) = x, f(c) = z, and f(d) = y. Also,
f({a, b, d}) = {f(a), f(b), f(d)} = {y, x, y} = {x, y}
The image (or: range) of f is the set {x, y, z}: f(A) = {x, y, z}.
Example 6.2: Let f : R → R be the mapping which assigns to each real number x its square x²:
x ↦ x²  or  f(x) = x²
Here the image of −3 is 9, so we may write f(−3) = 9.
We use the arrow ↦ to denote the image of an arbitrary element x ∈ A under a mapping f : A → B by writing
x ↦ f(x)
as illustrated in the preceding example.
Example 6.3: Consider the 2 × 3 matrix A = [1 −3 5; 2 4 −1]. If we write the vectors in R³ and R² as column vectors, then A determines the mapping T : R³ → R² defined by
v ↦ Av,  that is,  T(v) = Av, v ∈ R³
Thus if v = (3, 1, −2), then T(v) = Av = (−10, 12).
Remark: Every m × n matrix A over a field K determines the mapping T : Kⁿ → Kᵐ defined by
v ↦ Av
where the vectors in Kⁿ and Kᵐ are written as column vectors. For convenience we shall usually denote the above mapping by A, the same symbol used for the matrix.
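The remark can be made concrete in code. A minimal sketch, assuming the matrix entries of Example 6.3 as reconstructed from the scan:

```python
import numpy as np

# The matrix of Example 6.3 (signs are a best-effort reconstruction).
A = np.array([[1, -3, 5],
              [2, 4, -1]], float)

def T(v):
    """T : R^3 -> R^2, v |-> Av, with vectors written as columns."""
    return A @ v

print(T(np.array([3, 1, -2])))  # [-10.  12.]
```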
Example 6.4: Let V be the vector space of polynomials in the variable t over the real field R. Then the derivative defines a mapping D : V → V where, for any polynomial f ∈ V, we let D(f) = df/dt. For example, D(3t² − 5t + 2) = 6t − 5.

Example 6.5: Let V be the vector space of polynomials in t over R (as in the preceding example). Then the integral from, say, 0 to 1 defines a mapping ∫ : V → R where, for any polynomial f ∈ V, we let ∫(f) = ∫₀¹ f(t) dt. For example,
∫(3t² − 5t + 2) = ∫₀¹ (3t² − 5t + 2) dt = 1/2
Note that this map is from the vector space V into the scalar field R, whereas the map in the preceding example is from V into itself.

Example 6.6: Consider two mappings f : A → B and g : B → C. Let a ∈ A; then f(a) ∈ B, the domain of g. Hence we can obtain the image of f(a) under the mapping g, that is, g(f(a)). This map
a ↦ g(f(a))
from A into C is called the composition or product of f and g, and is denoted by g∘f. In other words, (g∘f) : A → C is the mapping defined by
(g∘f)(a) = g(f(a))
Our first theorem tells us that composition of mappings satisfies the associative law.
Theorem 6.1: Let f : A → B, g : B → C and h : C → D. Then h∘(g∘f) = (h∘g)∘f.
We prove this theorem now. If a ∈ A, then
(h∘(g∘f))(a) = h((g∘f)(a)) = h(g(f(a)))  and  ((h∘g)∘f)(a) = (h∘g)(f(a)) = h(g(f(a)))
Thus (h∘(g∘f))(a) = ((h∘g)∘f)(a) for every a ∈ A, and so h∘(g∘f) = (h∘g)∘f.
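Theorem 6.1 can be spot-checked on sample functions; f, g and h below are arbitrary illustrative choices:

```python
def compose(p, q):
    """Return the composition p ∘ q, i.e. x |-> p(q(x))."""
    return lambda x: p(q(x))

f = lambda x: x + 1      # f : A -> B
g = lambda x: 2 * x      # g : B -> C
h = lambda x: x ** 2     # h : C -> D

left = compose(h, compose(g, f))   # h ∘ (g ∘ f)
right = compose(compose(h, g), f)  # (h ∘ g) ∘ f

print([left(a) for a in range(5)])   # same values ...
print([right(a) for a in range(5)])  # ... either way
```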
Remark: Let F : A → B. Some texts write aF instead of F(a) for the image of a ∈ A under F. With this notation, the composition of functions F : A → B and G : B → C is denoted by F∘G and not by G∘F as used in this text.
We next introduce some special types of mappings.
Definition: A mapping f : A → B is said to be one-to-one (or one-one or 1-1) or injective if different elements of A have distinct images; that is,
if a ≠ a′ implies f(a) ≠ f(a′)
or, equivalently,
if f(a) = f(a′) implies a = a′
Definition: A mapping f : A → B is said to be onto (or: f maps A onto B) or surjective if every b ∈ B is the image of at least one a ∈ A.
A mapping which is both one-one and onto is said to be bijective.
Example 6.7: Let f : R → R, g : R → R and h : R → R be defined by f(x) = 2ˣ, g(x) = x³ − x and h(x) = x². The graphs of these mappings follow:
(graphs of f(x) = 2ˣ, g(x) = x³ − x and h(x) = x²)
The mapping f is one-one; geometrically, this means that each horizontal line does not contain more than one point of f. The mapping g is onto; geometrically, this means that each horizontal line contains at least one point of g. The mapping h is neither one-one nor onto; for example, 2 and −2 have the same image 4, and −16 is not the image of any element of R.
Example 6.8: Let A be any set. The mapping f : A → A defined by f(a) = a, i.e. which assigns to each element in A itself, is called the identity mapping on A and is denoted by 1_A or 1 or I.
Example 6.9: Let f : A → B. We call g : B → A the inverse of f, written f⁻¹, if
f∘g = 1_B  and  g∘f = 1_A
We emphasize that f has an inverse if and only if f is both one-to-one and onto (Problem 6.9). Also, if b ∈ B then f⁻¹(b) = a where a is the unique element of A for which f(a) = b.
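A small sketch of the defining equations f∘g = 1_B and g∘f = 1_A, using the illustrative bijection f(x) = 2x − 3 on R (its inverse is derived in Problem 6.10):

```python
def f(x):
    return 2 * x - 3

def g(y):
    """The inverse of f: solve y = 2x - 3 for x."""
    return (y + 3) / 2

for a in [-2.0, 0.0, 1.5, 7.0]:
    assert g(f(a)) == a    # g ∘ f = 1_A
    assert f(g(a)) == a    # f ∘ g = 1_B
print("g is the inverse of f")
```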
LINEAR MAPPINGS
Let V and U be vector spaces over the same field K. A mapping F : V → U is called a linear mapping (or linear transformation or vector space homomorphism) if it satisfies the following two conditions:
(1) For any v, w ∈ V, F(v + w) = F(v) + F(w).
(2) For any k ∈ K and any v ∈ V, F(kv) = kF(v).
In other words, F : V → U is linear if it "preserves" the two basic operations of a vector space, that of vector addition and that of scalar multiplication.
Substituting k = 0 into (2) we obtain F(0) = 0. That is, every linear mapping takes the zero vector into the zero vector.
Now for any scalars a, b ∈ K and any vectors v, w ∈ V we obtain, by applying both conditions of linearity,
F(av + bw) = F(av) + F(bw) = aF(v) + bF(w)
More generally, for any scalars ai ∈ K and any vectors vi ∈ V we obtain the basic property of linear mappings:
F(a1v1 + a2v2 + · · · + anvn) = a1F(v1) + a2F(v2) + · · · + anF(vn)
We remark that the condition F(av + bw) = aF(v) + bF(w) completely characterizes linear mappings and is sometimes used as their definition.
Example 6.10: Let A be any m × n matrix over a field K. As noted previously, A determines a mapping T : Kⁿ → Kᵐ by the assignment v ↦ Av. (Here the vectors in Kⁿ and Kᵐ are written as columns.) We claim that T is linear. For, by properties of matrices,
T(v + w) = A(v + w) = Av + Aw = T(v) + T(w)
and T(kv) = A(kv) = kAv = kT(v)
where v, w ∈ Kⁿ and k ∈ K.
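The two matrix identities above can be verified numerically for a randomly chosen A, v, w and k; a sketch with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4)).astype(float)  # an arbitrary 3x4 matrix
v = rng.integers(-5, 5, size=4).astype(float)
w = rng.integers(-5, 5, size=4).astype(float)
k = 7.0

# T(v + w) = T(v) + T(w) and T(kv) = kT(v) for T(v) = Av.
assert np.allclose(A @ (v + w), A @ v + A @ w)
assert np.allclose(A @ (k * v), k * (A @ v))
print("T(v) = Av is linear")
```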
We comment that the above type of linear mapping shall occur again and again. In fact, in the next chapter we show that every linear mapping from one finite-dimensional vector space into another can be represented as a linear mapping of the above type.
Example 6.11: Let F : R³ → R³ be the "projection" mapping into the xy plane: F(x, y, z) = (x, y, 0). We show that F is linear. Let v = (a, b, c) and w = (a′, b′, c′). Then
F(v + w) = F(a + a′, b + b′, c + c′) = (a + a′, b + b′, 0) = (a, b, 0) + (a′, b′, 0) = F(v) + F(w)
and, for any k ∈ R,
F(kv) = F(ka, kb, kc) = (ka, kb, 0) = k(a, b, 0) = kF(v)
That is, F is linear.
Example 6.12: Let F : R² → R² be the "translation" mapping defined by F(x, y) = (x + 1, y + 2). Observe that F(0) = F(0, 0) = (1, 2) ≠ 0. That is, the zero vector is not mapped onto the zero vector. Hence F is not linear.
Example 6.13: Let F : V → U be the mapping which assigns 0 ∈ U to every v ∈ V. Then, for any v, w ∈ V and any k ∈ K, we have
F(v + w) = 0 = 0 + 0 = F(v) + F(w)  and  F(kv) = 0 = k0 = kF(v)
Thus F is linear. We call F the zero mapping and shall usually denote it by 0.
Example 6.14: Consider the identity mapping I : V → V which maps each v ∈ V into itself. Then, for any v, w ∈ V and any a, b ∈ K, we have
I(av + bw) = av + bw = aI(v) + bI(w)
Thus I is linear.
Example 6.15: Let V be the vector space of polynomials in the variable t over the real field R. Then the differential mapping D : V → V and the integral mapping ∫ : V → R defined in Examples 6.4 and 6.5 are linear. For it is proven in calculus that for any u, v ∈ V and k ∈ R,
d(u + v)/dt = du/dt + dv/dt  and  d(ku)/dt = k du/dt
that is, D(u + v) = D(u) + D(v) and D(ku) = k D(u); and also,
∫₀¹ (u(t) + v(t)) dt = ∫₀¹ u(t) dt + ∫₀¹ v(t) dt  and  ∫₀¹ ku(t) dt = k ∫₀¹ u(t) dt
that is, ∫(u + v) = ∫(u) + ∫(v) and ∫(ku) = k∫(u).
Example 6.16: Let F : V → U be a linear mapping which is both one-one and onto. Then an inverse mapping F⁻¹ : U → V exists. We will show (Problem 6.17) that this inverse mapping is also linear.
When we investigated the coordinates of a vector relative to a basis, we also introduced
the notion of two spaces being isomorphic. We now give a formal definition.
Definition: A linear mapping F : V → U is called an isomorphism if it is one-to-one. The vector spaces V, U are said to be isomorphic if there is an isomorphism of V onto U.
Example 6.17: Let V be a vector space over K of dimension n and let {e1, ..., en} be a basis of V. Then as noted previously the mapping v ↦ [v]_e, i.e. which maps each v ∈ V into its coordinate vector relative to the basis {ei}, is an isomorphism of V onto Kⁿ.
Our next theorem gives us an abundance of examples of linear mappings; in particular,
it tells us that a linear mapping is completely determined by its values on the elements
of a basis.
Theorem 6.2: Let V and U be vector spaces over a field K. Let {v1, v2, ..., vn} be a basis of V and let u1, u2, ..., un be any vectors in U. Then there exists a unique linear mapping F : V → U such that F(v1) = u1, F(v2) = u2, ..., F(vn) = un.
We emphasize that the vectors u1, ..., un in the preceding theorem are completely arbitrary; they may be linearly dependent or they may even be equal to each other.
KERNEL AND IMAGE OF A LINEAR MAPPING
We begin by defining two concepts.
Definition: Let F : V → U be a linear mapping. The image of F, written Im F, is the set of image points in U:
Im F = {u ∈ U : F(v) = u for some v ∈ V}
The kernel of F, written Ker F, is the set of elements in V which map into 0 ∈ U:
Ker F = {v ∈ V : F(v) = 0}
The following theorem is easily proven (Problem 6.22).
Theorem 6.3: Let F : V → U be a linear mapping. Then the image of F is a subspace of U and the kernel of F is a subspace of V.
Example 6.18: Let F : R³ → R³ be the projection mapping into the xy plane: F(x, y, z) = (x, y, 0). Clearly the image of F is the entire xy plane:
Im F = {(a, b, 0) : a, b ∈ R}
Note that the kernel of F is the z axis:
Ker F = {(0, 0, c) : c ∈ R}
since these points and only these points map into the zero vector 0 = (0, 0, 0).
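The image and kernel of this projection can also be recovered numerically from its matrix; the SVD-based null-space computation below is a standard technique, not part of the text:

```python
import numpy as np

# F(v) = Av for the projection onto the xy plane.
A = np.diag([1.0, 1.0, 0.0])

print(np.linalg.matrix_rank(A))   # dim (Im F) = 2: the xy plane

# Right singular vectors with zero singular value span Ker F.
_, s, Vt = np.linalg.svd(A)
kernel = Vt[s < 1e-12]
print(kernel)                     # +/- (0, 0, 1): the z axis
```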
Theorem 5.11: The dimension of the solution space W of the homogeneous system of linear equations AX = 0 is n − r, where n is the number of unknowns and r is the rank of the coefficient matrix A.
OPERATIONS WITH LINEAR MAPPINGS
We are able to combine linear mappings in various ways to obtain new linear mappings.
These operations are very important and shall be used throughout the text.
Suppose F : V → U and G : V → U are linear mappings of vector spaces over a field K. We define the sum F + G to be the mapping from V into U which assigns F(v) + G(v) to v ∈ V:
(F + G)(v) = F(v) + G(v)
Furthermore, for any scalar k ∈ K, we define the product kF to be the mapping from V into U which assigns kF(v) to v ∈ V:
(kF)(v) = kF(v)
We show that if F and G are linear, then F + G and kF are also linear. We have, for any vectors v, w ∈ V and any scalars a, b ∈ K,
(F + G)(av + bw) = F(av + bw) + G(av + bw)
= aF(v) + bF(w) + aG(v) + bG(w)
= a(F(v) + G(v)) + b(F(w) + G(w))
= a(F + G)(v) + b(F + G)(w)
and (kF)(av + bw) = kF(av + bw) = k(aF(v) + bF(w))
= akF(v) + bkF(w) = a(kF)(v) + b(kF)(w)
Thus F + G and kF are linear.
The following theorem applies.
Theorem 6.6: Let V and U be vector spaces over a field K. Then the collection of all linear mappings from V into U with the above operations of addition and scalar multiplication forms a vector space over K.
The space in the above theorem is usually denoted by
Hom(V, U)
Here Hom comes from the word homomorphism. In the case that V and U are of finite dimension, we have the following theorem.
Theorem 6.7: Suppose dim V = m and dim U = n. Then dim Hom(V, U) = mn.
Now suppose that V, U and W are vector spaces over the same field K, and that F : V → U and G : U → W are linear mappings.
Recall that the composition function G∘F is the mapping from V into W defined by (G∘F)(v) = G(F(v)). We show that G∘F is linear whenever F and G are linear. We have, for any vectors v, w ∈ V and any scalars a, b ∈ K,
(G∘F)(av + bw) = G(F(av + bw)) = G(aF(v) + bF(w))
= aG(F(v)) + bG(F(w)) = a(G∘F)(v) + b(G∘F)(w)
That is, G∘F is linear.
The composition of linear mappings and that of addition and scalar multiplication are
related as follows:
Theorem 6.8: Let V, U and W be vector spaces over K. Let F, F′ be linear mappings from V into U and G, G′ linear mappings from U into W, and let k ∈ K. Then:
(i) G∘(F + F′) = G∘F + G∘F′
(ii) (G + G′)∘F = G∘F + G′∘F
(iii) k(G∘F) = (kG)∘F = G∘(kF).
ALGEBRA OF LINEAR OPERATORS
Let V be a vector space over a field K. We now consider the special case of linear mappings T : V → V, i.e. from V into itself. They are also called linear operators or linear transformations on V. We will write A(V), instead of Hom(V, V), for the space of all such mappings.
By Theorem 6.6, A(V) is a vector space over K; it is of dimension n² if V is of dimension n. Now if T, S ∈ A(V), then the composition S∘T exists and is also a linear mapping from V into itself, i.e. S∘T ∈ A(V). Thus we have a "multiplication" defined in A(V). (We shall write ST for S∘T in the space A(V).)
We remark that an algebra A over a field K is a vector space over K in which an operation of multiplication is defined satisfying, for every F, G, H ∈ A and every k ∈ K,
(i) F(G + H) = FG + FH
(ii) (G + H)F = GF + HF
(iii) k(GF) = (kG)F = G(kF).
If the associative law also holds for the multiplication, i.e. if for every F, G, H ∈ A,
(iv) (FG)H = F(GH)
then the algebra A is said to be associative. Thus by Theorems 6.8 and 6.1, A(V) is an associative algebra over K with respect to composition of mappings; hence it is frequently called the algebra of linear operators on V.
Observe that the identity mapping I : V → V belongs to A(V). Also, for any T ∈ A(V), we have TI = IT = T. We note that we can also form "powers" of T; we use the notation T² = T∘T, T³ = T∘T∘T, .... Furthermore, for any polynomial
p(x) = a0 + a1x + a2x² + · · · + anxⁿ,  ai ∈ K
we can form the operator p(T) defined by
p(T) = a0I + a1T + a2T² + · · · + anTⁿ
(For a scalar k ∈ K, the operator kI is frequently denoted by simply k.) In particular, if p(T) = 0, the zero mapping, then T is said to be a zero of the polynomial p(x).
Example 6.21: Let T : R³ → R³ be defined by T(x, y, z) = (0, x, y). Now if (a, b, c) is any element of R³, then:
(T + I)(a, b, c) = (0, a, b) + (a, b, c) = (a, a + b, b + c)
and T³(a, b, c) = T²(0, a, b) = T(0, 0, a) = (0, 0, 0)
Thus we see that T³ = 0, the zero mapping from V into itself. In other words, T is a zero of the polynomial p(x) = x³.
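Example 6.21 can be replayed with the matrix of T with respect to the usual basis of R³:

```python
import numpy as np

# T(x, y, z) = (0, x, y) as a matrix.
T = np.array([[0, 0, 0],
              [1, 0, 0],
              [0, 1, 0]], float)
I = np.eye(3)

print((T + I) @ np.array([1, 2, 3]))  # (a, a+b, b+c) = [1. 3. 5.]
print(np.linalg.matrix_power(T, 3))   # the zero matrix: T^3 = 0
```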
INVERTIBLE OPERATORS
A linear operator T : V → V is said to be invertible if it has an inverse, i.e. if there exists T⁻¹ ∈ A(V) such that TT⁻¹ = T⁻¹T = I.
Now T is invertible if and only if it is one-one and onto. Thus in particular, if T is invertible then only 0 ∈ V can map into 0, i.e. T is nonsingular. On the other hand, suppose T is nonsingular, i.e. Ker T = {0}. Recall (page 127) that T is then also one-one. Moreover, assuming V has finite dimension, we have, by Theorem 6.4,
dim V = dim (Im T) + dim (Ker T) = dim (Im T) + dim ({0}) = dim (Im T) + 0 = dim (Im T)
Then Im T = V, i.e. the image of T is V; thus T is onto. Hence T is both one-one and onto and so is invertible. We have just proven
Theorem 6.9: A linear operator T : V → V on a vector space of finite dimension is invertible if and only if it is nonsingular.
Example 6.22: Let T be the operator on R² defined by T(x, y) = (y, 2x − y). The kernel of T is {(0, 0)}; hence T is nonsingular and, by the preceding theorem, invertible. We now find a formula for T⁻¹. Suppose (s, t) is the image of (x, y) under T; hence (x, y) is the image of (s, t) under T⁻¹: T(x, y) = (s, t) and T⁻¹(s, t) = (x, y). We have
T(x, y) = (y, 2x − y) = (s, t)  and so  y = s, 2x − y = t
Solving for x and y in terms of s and t, we obtain x = ½s + ½t, y = s. Thus T⁻¹ is given by the formula T⁻¹(s, t) = (½s + ½t, s).
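The formula for T⁻¹ agrees with the matrix inverse; a numeric check with NumPy:

```python
import numpy as np

# T(x, y) = (y, 2x - y) as a matrix.
T = np.array([[0, 1],
              [2, -1]], float)
T_inv = np.linalg.inv(T)
print(T_inv)   # [[0.5 0.5], [1. 0.]] -- i.e. T^{-1}(s, t) = (s/2 + t/2, s)

s, t = 4.0, 2.0
print(T_inv @ np.array([s, t]))  # [3. 4.] = (s/2 + t/2, s)
```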
The finiteness of the dimensionality of V in the preceding theorem is necessary, as seen in the next example.
Example 6.23: Let V be the vector space of polynomials over K, and let T be the operator on V defined by
T(a0 + a1t + · · · + antⁿ) = a0t + a1t² + · · · + antⁿ⁺¹
i.e. T increases the exponent of t in each term by 1. Now T is a linear mapping and is nonsingular. However, T is not onto and so is not invertible.
We now give an important application of the above theorem to systems of linear equations over K. Consider a system with the same number of equations as unknowns, say n. We can represent this system by the matrix equation
Ax = b  (*)
where A is an n-square matrix over K which we view as a linear operator on Kⁿ. Suppose the matrix A is nonsingular, i.e. the matrix equation Ax = 0 has only the zero solution. Then, by Theorem 6.9, the linear mapping A is one-to-one and onto. This means that the system (*) has a unique solution for any b ∈ Kⁿ. On the other hand, suppose the matrix A is singular, i.e. the matrix equation Ax = 0 has a nonzero solution. Then the linear mapping A is not onto. This means that there exist b ∈ Kⁿ for which (*) does not have a solution. Furthermore, if a solution exists it is not unique. Thus we have proven the following fundamental result:
Theorem 6.10: Consider the following system of linear equations:
a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
. . . . . . . . . . . . . . . . . .
an1x1 + an2x2 + · · · + annxn = bn
(i) If the corresponding homogeneous system has only the zero solution, then the above system has a unique solution for any values of the bi.
(ii) If the corresponding homogeneous system has a nonzero solution, then: (a) there are values for the bi for which the above system does not have a solution; (b) whenever a solution of the above system exists, it is not unique.
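Theorem 6.10(i) in action; A and b below are illustrative choices, not from the text:

```python
import numpy as np

A = np.array([[2, 1], [1, 1]], float)
b = np.array([3, 2], float)

# A is nonsingular, so Ax = 0 has only the zero solution ...
assert np.linalg.matrix_rank(A) == 2
# ... and Ax = b has exactly one solution for any b.
x = np.linalg.solve(A, b)
print(x)   # [1. 1.]

# A singular matrix: Ax = 0 has nonzero solutions, so some b admit
# no solution and solvable systems have infinitely many solutions.
S = np.array([[1, 1], [2, 2]], float)
print(np.linalg.matrix_rank(S))  # 1 < 2
```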
Solved Problems
MAPPINGS
6.1. State whether or not each diagram defines a mapping from A = {a, b, c} into B = {x, y, z}.
(i) No. There is nothing assigned to the element b ∈ A.
(ii) No. Two elements, x and z, are assigned to c ∈ A.
(iii) Yes.
6.2. Use a formula to define each of the following functions from R into R.
(i) To each number let / assign its cube.
(ii) To each number let g assign the number 5.
(iii) To each positive number let h assign its square, and to each nonpositive number
let h assign the number 6.
Also, find the value of each function at 4, −2 and 0.
(i) Since f assigns to any number x its cube x³, we can define f by f(x) = x³. Also:
f(4) = 4³ = 64,  f(−2) = (−2)³ = −8,  f(0) = 0³ = 0
(ii) Since g assigns 5 to any number x, we can define g by g(x) = 5. Thus the value of g at each of the numbers 4, −2 and 0 is 5:
g(4) = 5,  g(−2) = 5,  g(0) = 5
(iii) Two different rules are used to define h as follows:
h(x) = x² if x > 0,  h(x) = 6 if x ≤ 0
Since 4 > 0, h(4) = 4² = 16. On the other hand, −2 ≤ 0 and 0 ≤ 0, and so h(−2) = 6, h(0) = 6.
6.3. Let A = {1, 2, 3, 4, 5} and let f : A → A be the mapping defined by the diagram on the right. (i) Find the image of f. (ii) Find the graph of f.
(i) The image f(A) of the mapping f consists of all the points assigned to elements of A. Now only 2, 3 and 5 appear as the image of any elements of A; hence f(A) = {2, 3, 5}.
(ii) The graph of f consists of the ordered pairs (a, f(a)), where a ∈ A. Now f(1) = 3, f(2) = 5, f(3) = 5, f(4) = 2, f(5) = 3; hence the graph of
f = {(1, 3), (2, 5), (3, 5), (4, 2), (5, 3)}
6.4. Sketch the graph of (i) f(x) = x² + x − 6, (ii) g(x) = x³ − 3x² − x + 3.
Note that these are "polynomial functions". In each case set up a table of values for x and then find the corresponding values of f(x). Plot the points in a coordinate diagram and then draw a smooth continuous curve through the points.
(i)  x:    −4  −3  −2  −1   0   1   2   3
     f(x):  6   0  −4  −6  −6  −4   0   6
(ii) x:    −2  −1   0   1   2   3   4
     g(x): −15   0   3   0  −3   0  15
6.5. Let the mappings f : A → B and g : B → C be defined by the diagram
(diagram)
(i) Find the composition mapping (g∘f) : A → C. (ii) Find the image of each mapping: f, g and g∘f.
(i) We use the definition of the composition mapping to compute:
(g∘f)(a) = g(f(a)) = g(y) = t
(g∘f)(b) = g(f(b)) = g(x) = s
(g∘f)(c) = g(f(c)) = g(y) = t
Observe that we arrive at the same answer if we "follow the arrows" in the diagram:
a → y → t,  b → x → s,  c → y → t
(ii) By the diagram, the image values under the mapping f are x and y, and the image values under g are r, s and t; hence
image of f = {x, y}  and  image of g = {r, s, t}
By (i), the image values under the composition mapping g∘f are t and s; hence image of g∘f = {s, t}. Note that the images of g and g∘f are different.
6.6. Let the mappings f and g be defined by f(x) = 2x + 1 and g(x) = x² − 2. (i) Find (g∘f)(4) and (f∘g)(4). (ii) Find formulas defining the composition mappings g∘f and f∘g.
(i) f(4) = 2·4 + 1 = 9. Hence (g∘f)(4) = g(f(4)) = g(9) = 9² − 2 = 79.
g(4) = 4² − 2 = 14. Hence (f∘g)(4) = f(g(4)) = f(14) = 2·14 + 1 = 29.
(ii) Compute the formula for g∘f as follows:
(g∘f)(x) = g(f(x)) = g(2x + 1) = (2x + 1)² − 2 = 4x² + 4x − 1
Observe that the same answer can be found by writing y = f(x) = 2x + 1 and z = g(y) = y² − 2, and then eliminating y: z = y² − 2 = (2x + 1)² − 2 = 4x² + 4x − 1.
(f∘g)(x) = f(g(x)) = f(x² − 2) = 2(x² − 2) + 1 = 2x² − 3. Observe that f∘g ≠ g∘f.
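The closed forms derived in (ii) can be spot-checked numerically:

```python
f = lambda x: 2 * x + 1
g = lambda x: x ** 2 - 2

gof = lambda x: 4 * x ** 2 + 4 * x - 1   # derived formula for (g∘f)(x)
fog = lambda x: 2 * x ** 2 - 3           # derived formula for (f∘g)(x)

for x in range(-3, 4):
    assert g(f(x)) == gof(x)
    assert f(g(x)) == fog(x)
print(g(f(4)), f(g(4)))  # 79 29
```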
6.7. Let the mappings f : A → B, g : B → C and h : C → D be defined by the diagram
(diagram)
Determine if each mapping (i) is one-one, (ii) is onto, (iii) has an inverse.
(i) The mapping f : A → B is one-one since each element of A has a different image. The mapping g : B → C is not one-one since x and z both map into the same element 4. The mapping h : C → D is one-one.
(ii) The mapping f : A → B is not onto since there is an element of B which is not the image of any element of A. The mapping g : B → C is onto since each element of C is the image of some element of B. The mapping h : C → D is also onto.
(iii) A mapping has an inverse if and only if it is both one-one and onto. Hence only h has an inverse.
6.8. Suppose f : A → B and g : B → C; hence the composition mapping (g∘f) : A → C exists. Prove the following. (i) If f and g are one-one, then g∘f is one-one. (ii) If f and g are onto, then g∘f is onto. (iii) If g∘f is one-one, then f is one-one. (iv) If g∘f is onto, then g is onto.
(i) Suppose (g∘f)(x) = (g∘f)(y). Then g(f(x)) = g(f(y)). Since g is one-one, f(x) = f(y). Since f is one-one, x = y. We have proven that (g∘f)(x) = (g∘f)(y) implies x = y; hence g∘f is one-one.
(ii) Suppose c ∈ C. Since g is onto, there exists b ∈ B for which g(b) = c. Since f is onto, there exists a ∈ A for which f(a) = b. Thus (g∘f)(a) = g(f(a)) = g(b) = c; hence g∘f is onto.
(iii) Suppose f is not one-one. Then there exist distinct elements x, y ∈ A for which f(x) = f(y). Thus (g∘f)(x) = g(f(x)) = g(f(y)) = (g∘f)(y); hence g∘f is not one-one. Accordingly, if g∘f is one-one, then f must be one-one.
(iv) If a ∈ A, then (g∘f)(a) = g(f(a)) ∈ g(B); hence (g∘f)(A) ⊆ g(B). Suppose g is not onto. Then g(B) is properly contained in C and so (g∘f)(A) is properly contained in C; thus g∘f is not onto. Accordingly, if g∘f is onto, then g must be onto.
6.9. Prove that a mapping f : A → B has an inverse if and only if it is one-to-one and onto.
Suppose f has an inverse, i.e. there exists a function f⁻¹ : B → A for which f⁻¹∘f = 1_A and f∘f⁻¹ = 1_B. Since 1_A is one-to-one, f is one-to-one by Problem 6.8(iii); and since 1_B is onto, f is onto by Problem 6.8(iv). That is, f is both one-to-one and onto.
Now suppose f is both one-to-one and onto. Then each b ∈ B is the image of a unique element in A, say b̂. Thus if f(a) = b, then a = b̂; hence f(b̂) = b. Now let g denote the mapping from B to A defined by b ↦ b̂. We have:
(i) (g∘f)(a) = g(f(a)) = g(b) = b̂ = a, for every a ∈ A; hence g∘f = 1_A.
(ii) (f∘g)(b) = f(g(b)) = f(b̂) = b, for every b ∈ B; hence f∘g = 1_B.
Accordingly, f has an inverse. Its inverse is the mapping g.
6.10. Let f : R → R be defined by f(x) = 2x − 3. Now f is one-to-one and onto; hence f has an inverse mapping f⁻¹. Find a formula for f⁻¹.
Let y be the image of x under the mapping f: y = f(x) = 2x − 3. Consequently x will be the image of y under the inverse mapping f⁻¹. Thus solve for x in terms of y in the above equation: x = (y + 3)/2. Then the formula defining the inverse function is f⁻¹(y) = (y + 3)/2.
LINEAR MAPPINGS
6.11. Show that the following mappings F are linear:
(i) F : R² → R² defined by F(x, y) = (x + y, x).
(ii) F : R³ → R defined by F(x, y, z) = 2x − 3y + 4z.
(i) Let v = (a, b) and w = (a′, b′); hence
v + w = (a + a′, b + b′)  and  kv = (ka, kb), k ∈ R
We have F(v) = (a + b, a) and F(w) = (a′ + b′, a′). Thus
F(v + w) = F(a + a′, b + b′) = (a + a′ + b + b′, a + a′)
= (a + b, a) + (a′ + b′, a′) = F(v) + F(w)
and F(kv) = F(ka, kb) = (ka + kb, ka) = k(a + b, a) = kF(v)
Since v, w and k were arbitrary, F is linear.
(ii) Let v = (a, b, c) and w = (a′, b′, c′); hence
v + w = (a + a′, b + b′, c + c′)  and  kv = (ka, kb, kc), k ∈ R
We have F(v) = 2a − 3b + 4c and F(w) = 2a′ − 3b′ + 4c′. Thus
F(v + w) = F(a + a′, b + b′, c + c′) = 2(a + a′) − 3(b + b′) + 4(c + c′)
= (2a − 3b + 4c) + (2a′ − 3b′ + 4c′) = F(v) + F(w)
and F(kv) = F(ka, kb, kc) = 2ka − 3kb + 4kc = k(2a − 3b + 4c) = kF(v)
Accordingly, F is linear.
6.12. Show that the following mappings F are not linear:
(i) F : R² → R defined by F(x, y) = xy.
(ii) F : R² → R³ defined by F(x, y) = (x + 1, 2y, x + y).
(iii) F : R³ → R² defined by F(x, y, z) = (|x|, 0).
(i) Let v = (1, 2) and w = (3, 4); then v + w = (4, 6).
We have F(v) = 1·2 = 2 and F(w) = 3·4 = 12. Hence
F(v + w) = F(4, 6) = 4·6 = 24 ≠ F(v) + F(w)
Accordingly, F is not linear.
(ii) Since F(0, 0) = (1, 0, 0) ≠ (0, 0, 0), F cannot be linear.
(iii) Let v = (1, 2, 3) and k = −3; hence kv = (−3, −6, −9).
We have F(v) = (1, 0) and so kF(v) = −3(1, 0) = (−3, 0). Then
F(kv) = F(−3, −6, −9) = (3, 0) ≠ kF(v)
and hence F is not linear.
6.13. Let V be the vector space of n-square matrices over K. Let M be an arbitrary matrix in V. Let T : V → V be defined by T(A) = AM + MA, where A ∈ V. Show that T is linear.
For any A, B ∈ V and any k ∈ K, we have
T(A + B) = (A + B)M + M(A + B) = AM + BM + MA + MB
= (AM + MA) + (BM + MB) = T(A) + T(B)
and T(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kT(A)
Accordingly, T is linear.
6.14. Prove Theorem 6.2: Let V and U be vector spaces over a field K. Let {v1, ..., vn} be a basis of V and let u1, ..., un be any arbitrary vectors in U. Then there exists a unique linear mapping F : V → U such that F(v1) = u1, F(v2) = u2, ..., F(vn) = un.
There are three steps to the proof of the theorem: (1) Define a mapping F : V → U such that F(vi) = ui, i = 1, ..., n. (2) Show that F is linear. (3) Show that F is unique.
Step (1). Let v ∈ V. Since {v1, ..., vn} is a basis of V, there exist unique scalars a1, ..., an ∈ K for which v = a1v1 + a2v2 + · · · + anvn. We define F : V → U by F(v) = a1u1 + a2u2 + · · · + anun. (Since the ai are unique, the mapping F is well-defined.) Now, for i = 1, ..., n,
vi = 0v1 + · · · + 1vi + · · · + 0vn
Hence F(vi) = 0u1 + · · · + 1ui + · · · + 0un = ui
Thus the first step of the proof is complete.
Step (2). Suppose v = a1v1 + a2v2 + · · · + anvn and w = b1v1 + b2v2 + · · · + bnvn. Then
v + w = (a1 + b1)v1 + (a2 + b2)v2 + · · · + (an + bn)vn
and, for any k ∈ K, kv = ka1v1 + ka2v2 + · · · + kanvn. By definition of the mapping F,
F(v) = a1u1 + a2u2 + · · · + anun  and  F(w) = b1u1 + b2u2 + · · · + bnun
Hence F(v + w) = (a1 + b1)u1 + (a2 + b2)u2 + · · · + (an + bn)un
= (a1u1 + a2u2 + · · · + anun) + (b1u1 + b2u2 + · · · + bnun)
= F(v) + F(w)
and F(kv) = k(a1u1 + a2u2 + · · · + anun) = kF(v)
Thus F is linear.
Step (3). Now suppose G : V → U is linear and G(vi) = ui, i = 1, ..., n. If v = a1v1 + a2v2 + · · · + anvn, then
G(v) = G(a1v1 + a2v2 + · · · + anvn) = a1G(v1) + a2G(v2) + · · · + anG(vn)
= a1u1 + a2u2 + · · · + anun = F(v)
Since G(v) = F(v) for every v ∈ V, G = F. Thus F is unique and the theorem is proved.
136 LINEAR MAPPINGS [CHAP. 6
6.15. Let T : R² → R be the linear mapping for which
T(1, 1) = 3 and T(0, 1) = −2 (1)
(Since {(1, 1), (0, 1)} is a basis of R², such a linear mapping exists and is unique by
Theorem 6.2.) Find T(a, b).
First we write (a, b) as a linear combination of (1, 1) and (0, 1) using unknown scalars x and y:
(a, b) = x(1, 1) + y(0, 1) (2)
Then (a, b) = (x, x) + (0, y) = (x, x + y) and so x = a, x + y = b
Solving for x and y in terms of a and b, we obtain
x = a and y = b − a (3)
Now using (1) and (2) we have
T(a, b) = T(x(1, 1) + y(0, 1)) = xT(1, 1) + yT(0, 1) = 3x − 2y
Finally, using (3), we have T(a, b) = 3x − 2y = 3a − 2(b − a) = 5a − 2b.
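The same two-step recipe — solve a linear system for the coordinates, then apply linearity — can be carried out numerically. A sketch with NumPy (the test point (2, 3) is an arbitrary choice):

```python
import numpy as np

# Basis vectors of R^2 and their prescribed images under T (Problem 6.15).
basis = np.array([[1.0, 1.0],
                  [0.0, 1.0]])        # rows: (1,1) and (0,1)
images = np.array([3.0, -2.0])        # T(1,1) = 3, T(0,1) = -2

def T(a, b):
    # Write (a,b) = x(1,1) + y(0,1) by solving basis^T (x,y) = (a,b),
    # then use linearity: T(a,b) = x*T(1,1) + y*T(0,1).
    x, y = np.linalg.solve(basis.T, np.array([a, b], dtype=float))
    return x * images[0] + y * images[1]

# Agrees with the closed form T(a,b) = 5a - 2b derived above.
assert np.isclose(T(2.0, 3.0), 5*2.0 - 2*3.0)
```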
6.16. Let T : V → U be linear, and suppose v1, ..., vn ∈ V have the property that their
images T(v1), ..., T(vn) are linearly independent. Show that the vectors v1, ..., vn
are also linearly independent.
Suppose that, for scalars a1, ..., an, a1v1 + a2v2 + ... + anvn = 0. Then
0 = T(0) = T(a1v1 + a2v2 + ... + anvn) = a1T(v1) + a2T(v2) + ... + anT(vn)
Since the T(vi) are linearly independent, all the ai = 0. Thus the vectors v1, ..., vn are linearly
independent.
6.17. Suppose the linear mapping F : V → U is one-to-one and onto. Show that the inverse
mapping F⁻¹ : U → V is also linear.
Suppose u, u′ ∈ U. Since F is one-to-one and onto, there exist unique vectors v, v′ ∈ V for
which F(v) = u and F(v′) = u′. Since F is linear, we also have
F(v + v′) = F(v) + F(v′) = u + u′ and F(kv) = kF(v) = ku
By definition of the inverse mapping, F⁻¹(u) = v, F⁻¹(u′) = v′, F⁻¹(u + u′) = v + v′ and
F⁻¹(ku) = kv. Then
F⁻¹(u + u′) = v + v′ = F⁻¹(u) + F⁻¹(u′) and F⁻¹(ku) = kv = kF⁻¹(u)
and thus F⁻¹ is linear.
IMAGE AND KERNEL OF LINEAR MAPPINGS
6.18. Let F : R⁴ → R³ be the linear mapping defined by
F(x, y, s, t) = (x − y + s + t, x + 2s − t, x + y + 3s − 3t)
Find a basis and the dimension of the (i) image U of F, (ii) kernel W of F.
(i) The images of the following generators of R⁴ generate the image U of F:
F(1, 0, 0, 0) = (1, 1, 1)     F(0, 0, 1, 0) = (1, 2, 3)
F(0, 1, 0, 0) = (−1, 0, 1)    F(0, 0, 0, 1) = (1, −1, −3)
Form the matrix whose rows are the generators of U and row reduce to echelon form:

    [ 1  1  1]        [1  1  1]        [1  1  1]
    [-1  0  1]   to   [0  1  2]   to   [0  1  2]
    [ 1  2  3]        [0  1  2]        [0  0  0]
    [ 1 -1 -3]        [0 -2 -4]        [0  0  0]

Thus {(1, 1, 1), (0, 1, 2)} is a basis of U; hence dim U = 2.
(ii) We seek the set of (x, y, s, t) such that F(x, y, s, t) = (0, 0, 0), i.e.,
F(x, y, s, t) = (x − y + s + t, x + 2s − t, x + y + 3s − 3t) = (0, 0, 0)
Set corresponding components equal to each other to form the following homogeneous system
whose solution space is the kernel W of F:

    x − y +  s +  t = 0        x − y +  s +  t = 0
    x     + 2s −  t = 0   or       y +  s − 2t = 0   or   x − y + s +  t = 0
    x + y + 3s − 3t = 0           2y + 2s − 4t = 0            y + s − 2t = 0

The free variables are s and t; hence dim W = 2. Set
(a) s = −1, t = 0 to obtain the solution (2, 1, −1, 0),
(b) s = 0, t = 1 to obtain the solution (1, 2, 0, 1).
Thus {(2, 1, −1, 0), (1, 2, 0, 1)} is a basis of W. (Observe that dim U + dim W = 2 + 2 = 4,
which is the dimension of the domain R⁴ of F.)
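Both dimensions, and the rank-nullity bookkeeping of Theorem 6.4, can be confirmed numerically from the matrix of F relative to the usual bases. A sketch with NumPy:

```python
import numpy as np

# Matrix of F(x,y,s,t) = (x - y + s + t, x + 2s - t, x + y + 3s - 3t);
# its columns are the images of e1, ..., e4 computed in part (i).
A = np.array([[1, -1, 1,  1],
              [1,  0, 2, -1],
              [1,  1, 3, -3]], dtype=float)

rank = np.linalg.matrix_rank(A)    # dim (image of F)
nullity = A.shape[1] - rank        # dim (kernel of F), by Theorem 6.4

assert rank == 2 and nullity == 2

# The two kernel vectors found in part (ii) are indeed solutions of F(v) = 0.
for v in ([2, 1, -1, 0], [1, 2, 0, 1]):
    assert np.allclose(A @ np.array(v, dtype=float), 0)
```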
6.19. Let T : R³ → R³ be the linear mapping defined by
T(x, y, z) = (x + 2y − z, y + z, x + y − 2z)
Find a basis and the dimension of the (i) image U of T, (ii) kernel W of T.
(i) The images of generators of R³ generate the image U of T:
T(1, 0, 0) = (1, 0, 1), T(0, 1, 0) = (2, 1, 1), T(0, 0, 1) = (−1, 1, −2)
Form the matrix whose rows are the generators of U and row reduce to echelon form:

    [ 1  0  1]        [1  0  1]        [1  0  1]
    [ 2  1  1]   to   [0  1 -1]   to   [0  1 -1]
    [-1  1 -2]        [0  1 -1]        [0  0  0]

Thus {(1, 0, 1), (0, 1, −1)} is a basis of U, and so dim U = 2.
(ii) We seek the set of (x, y, z) such that T(x, y, z) = (0, 0, 0), i.e.,
T(x, y, z) = (x + 2y − z, y + z, x + y − 2z) = (0, 0, 0)
Set corresponding components equal to each other to form the homogeneous system whose
solution space is the kernel W of T:

    x + 2y −  z = 0        x + 2y − z = 0
        y +  z = 0    or       y + z = 0    or   x + 2y − z = 0
    x +  y − 2z = 0           −y − z = 0             y + z = 0

The only free variable is z; hence dim W = 1. Let z = 1; then y = −1 and x = 3. Thus
{(3, −1, 1)} is a basis of W. (Observe that dim U + dim W = 2 + 1 = 3, which is the dimen-
sion of the domain R³ of T.)
6.20. Find a linear map F : R³ → R⁴ whose image is generated by (1, 2, 0, −4) and (2, 0, −1, −3).
Method 1.
Consider the usual basis of R³: e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1). Set F(e1) = (1, 2, 0, −4),
F(e2) = (2, 0, −1, −3) and F(e3) = (0, 0, 0, 0). By Theorem 6.2, such a linear map F exists and is
unique. Furthermore, the image of F is generated by the F(ei); hence F has the required property.
We find a general formula for F(x, y, z):
F(x, y, z) = F(xe1 + ye2 + ze3) = xF(e1) + yF(e2) + zF(e3)
= x(1, 2, 0, −4) + y(2, 0, −1, −3) + z(0, 0, 0, 0)
= (x + 2y, 2x, −y, −4x − 3y)
Method 2.
Form a 4 × 3 matrix A whose columns consist only of the given vectors; say,

    A = [ 1  2  2]
        [ 2  0  0]
        [ 0 -1 -1]
        [-4 -3 -3]

Recall that A determines a linear map A : R³ → R⁴ whose image is generated by the columns of A.
Thus A satisfies the required condition.
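The two methods agree, and both are easy to check by machine. A sketch with NumPy (the test point (3, −1, 7) is an arbitrary choice):

```python
import numpy as np

# Method 2 above: the columns of A span the required image
# (the second given vector is repeated to make A a 4 x 3 matrix).
A = np.array([[ 1,  2,  2],
              [ 2,  0,  0],
              [ 0, -1, -1],
              [-4, -3, -3]], dtype=float)

# The image has dimension 2: the span of (1,2,0,-4) and (2,0,-1,-3).
assert np.linalg.matrix_rank(A) == 2

# Method 1's formula F(x,y,z) = (x + 2y, 2x, -y, -4x - 3y) gives the
# same vectors as the matrix with columns (1,2,0,-4) and (2,0,-1,-3):
A0 = np.array([[1, 2], [2, 0], [0, -1], [-4, -3]], dtype=float)
x, y, z = 3.0, -1.0, 7.0
F_xyz = np.array([x + 2*y, 2*x, -y, -4*x - 3*y])
assert np.allclose(F_xyz, A0 @ np.array([x, y]))
```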
6.21. Let V be the vector space of 2 by 2 matrices over R and let

        M = [1  2]
            [0  3]

Let F : V → V be the linear map defined by F(A) = AM − MA. Find a basis and the
dimension of the kernel W of F.
We seek the set of matrices A = [x y; s t] such that F(A) = 0. Writing out F(A):

    [x  y][1  2]  −  [1  2][x  y]  =  [x  2x + 3y]  −  [x + 2s  y + 2t]
    [s  t][0  3]     [0  3][s  t]     [s  2s + 3t]     [3s      3t    ]

                                   =  [−2s  2x + 2y − 2t]  =  [0  0]
                                      [−2s  2s          ]     [0  0]

Thus    2x + 2y − 2t = 0        x + y − t = 0
                          or
                 −2s = 0                s = 0

The free variables are y and t; hence dim W = 2. To obtain a basis of W set
(a) y = −1, t = 0 to obtain the solution x = 1, y = −1, s = 0, t = 0;
(b) y = 0, t = 1 to obtain the solution x = 1, y = 0, s = 0, t = 1.

Thus    { [1  -1] , [1  0] }    is a basis of W.
          [0   0]   [0  1]
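The kernel of F consists exactly of the matrices that commute with M, so the two basis matrices found above should satisfy AM = MA. A quick NumPy check:

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [0.0, 3.0]])

def F(A):
    """The map of Problem 6.21: F(A) = AM - MA."""
    return A @ M - M @ A

# The two basis matrices of the kernel W found above commute with M.
B1 = np.array([[1.0, -1.0], [0.0, 0.0]])
B2 = np.array([[1.0,  0.0], [0.0, 1.0]])
assert np.allclose(F(B1), 0)
assert np.allclose(F(B2), 0)
```

Note that B2 is the identity matrix, which commutes with every M; the identity always lies in the kernel of a map of this form.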
6.22. Prove Theorem 6.3: Let F : V → U be a linear mapping. Then (i) the image of F
is a subspace of U and (ii) the kernel of F is a subspace of V.
(i) Since F(0) = 0, 0 ∈ Im F. Now suppose u, u′ ∈ Im F and a, b ∈ K. Since u and u′ belong to
the image of F, there exist vectors v, v′ ∈ V such that F(v) = u and F(v′) = u′. Then
F(av + bv′) = aF(v) + bF(v′) = au + bu′ ∈ Im F
Thus the image of F is a subspace of U.
(ii) Since F(0) = 0, 0 ∈ Ker F. Now suppose v, w ∈ Ker F and a, b ∈ K. Since v and w belong
to the kernel of F, F(v) = 0 and F(w) = 0. Thus
F(av + bw) = aF(v) + bF(w) = a0 + b0 = 0 and so av + bw ∈ Ker F
Thus the kernel of F is a subspace of V.
6.23. Prove Theorem 6.4: Let V be of finite dimension, and let F : V → U be a linear map-
ping with image U′ and kernel W. Then dim U′ + dim W = dim V.
Suppose dim V = n. Since W is a subspace of V, its dimension is finite; say, dim W = r ≤ n.
Thus we need prove that dim U′ = n − r.
Let {w1, ..., wr} be a basis of W. We extend {wi} to a basis of V:
{w1, ..., wr, v1, ..., v_{n−r}}
Let B = {F(v1), F(v2), ..., F(v_{n−r})}
The theorem is proved if we show that B is a basis of the image U′ of F.
Proof that B generates U′. Let u ∈ U′. Then there exists v ∈ V such that F(v) = u. Since
{wi, vj} generates V and since v ∈ V,
v = a1w1 + ... + arwr + b1v1 + ... + b_{n−r}v_{n−r}
where the ai, bj are scalars. Note that F(wi) = 0 since the wi belong to the kernel of F. Thus
u = F(v) = F(a1w1 + ... + arwr + b1v1 + ... + b_{n−r}v_{n−r})
= a1F(w1) + ... + arF(wr) + b1F(v1) + ... + b_{n−r}F(v_{n−r})
= a1·0 + ... + ar·0 + b1F(v1) + ... + b_{n−r}F(v_{n−r})
= b1F(v1) + ... + b_{n−r}F(v_{n−r})
Accordingly, the F(vj) generate the image of F.
Proof that B is linearly independent. Suppose
a1F(v1) + a2F(v2) + ... + a_{n−r}F(v_{n−r}) = 0
Then F(a1v1 + a2v2 + ... + a_{n−r}v_{n−r}) = 0 and so a1v1 + ... + a_{n−r}v_{n−r} belongs to the kernel W
of F. Since {wi} generates W, there exist scalars b1, ..., br such that
a1v1 + a2v2 + ... + a_{n−r}v_{n−r} = b1w1 + b2w2 + ... + brwr
or a1v1 + ... + a_{n−r}v_{n−r} − b1w1 − ... − brwr = 0 (*)
Since {wi, vj} is a basis of V, it is linearly independent; hence the coefficients of the wi and vj in (*)
are all 0. In particular, a1 = 0, ..., a_{n−r} = 0. Accordingly, the F(vj) are linearly independent.
Thus B is a basis of U′, and so dim U′ = n − r and the theorem is proved.
6.24. Suppose f : V → U is linear with kernel W, and that f(v) = u. Show that the "coset"
v + W = {v + w : w ∈ W} is the preimage of u, that is, f⁻¹(u) = v + W.
We must prove that (i) f⁻¹(u) ⊂ v + W and (ii) v + W ⊂ f⁻¹(u). We first prove (i). Suppose
v′ ∈ f⁻¹(u). Then f(v′) = u and so f(v′ − v) = f(v′) − f(v) = u − u = 0, that is, v′ − v ∈ W. Thus
v′ = v + (v′ − v) ∈ v + W and hence f⁻¹(u) ⊂ v + W.
Now we prove (ii). Suppose v′ ∈ v + W. Then v′ = v + w where w ∈ W. Since W is the
kernel of f, f(w) = 0. Accordingly, f(v′) = f(v + w) = f(v) + f(w) = f(v) + 0 = f(v) = u. Thus
v′ ∈ f⁻¹(u) and so v + W ⊂ f⁻¹(u).
SINGULAR AND NONSINGULAR MAPPINGS
6.25. Suppose F : V → U is linear and that V is of finite dimension. Show that V and the
image of F have the same dimension if and only if F is nonsingular. Determine all
nonsingular mappings T : R⁴ → R³.
By Theorem 6.4, dim V = dim (Im F) + dim (Ker F). Hence V and Im F have the same di-
mension if and only if dim (Ker F) = 0, i.e. Ker F = {0}, i.e. if and only if F is nonsingular.
Since the dimension of R³ is less than the dimension of R⁴, so is the dimension of the image of
T. Accordingly, no linear mapping T : R⁴ → R³ can be nonsingular.
6.26. Prove that a linear mapping F : V → U is nonsingular if and only if the image of
an independent set is independent.
Suppose F is nonsingular and suppose {v1, ..., vn} is an independent subset of V. We claim that
the vectors F(v1), ..., F(vn) are independent. Suppose a1F(v1) + a2F(v2) + ... + anF(vn) = 0, where
ai ∈ K. Since F is linear, F(a1v1 + a2v2 + ... + anvn) = 0; hence
a1v1 + a2v2 + ... + anvn ∈ Ker F
But F is nonsingular, i.e. Ker F = {0}; hence a1v1 + a2v2 + ... + anvn = 0. Since the vi are linearly
independent, all the ai are 0. Accordingly, the F(vi) are linearly independent. In other words, the
image of the independent set {v1, ..., vn} is independent.
On the other hand, suppose the image of any independent set is independent. If v ∈ V is
nonzero, then {v} is independent. Then {F(v)} is independent and so F(v) ≠ 0. Accordingly, F is
nonsingular.
OPERATIONS WITH LINEAR MAPPINGS
6.27. Let F : R³ → R² and G : R³ → R² be defined by F(x, y, z) = (2x, y + z) and G(x, y, z) =
(x − z, y). Find formulas defining the mappings F + G, 3F and 2F − 5G.
(F + G)(x, y, z) = F(x, y, z) + G(x, y, z)
= (2x, y + z) + (x − z, y) = (3x − z, 2y + z)
(3F)(x, y, z) = 3F(x, y, z) = 3(2x, y + z) = (6x, 3y + 3z)
(2F − 5G)(x, y, z) = 2F(x, y, z) − 5G(x, y, z) = 2(2x, y + z) − 5(x − z, y)
= (4x, 2y + 2z) + (−5x + 5z, −5y) = (−x + 5z, −3y + 2z)
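These pointwise operations on mappings can be modeled directly in code; the helper names `add` and `scale` below are illustrative choices, not from the text:

```python
def F(x, y, z):
    return (2*x, y + z)

def G(x, y, z):
    return (x - z, y)

def add(f, g):
    # Pointwise sum of two maps from R^3 into R^2: (f + g)(v) = f(v) + g(v).
    return lambda x, y, z: tuple(a + b for a, b in zip(f(x, y, z), g(x, y, z)))

def scale(k, f):
    # Scalar multiple: (k f)(v) = k * f(v).
    return lambda x, y, z: tuple(k * a for a in f(x, y, z))

FplusG = add(F, G)
twoF_minus_5G = add(scale(2, F), scale(-5, G))

# At (1, 1, 1): (3x - z, 2y + z) = (2, 3) and (-x + 5z, -3y + 2z) = (4, -1).
assert FplusG(1, 1, 1) == (2, 3)
assert twoF_minus_5G(1, 1, 1) == (4, -1)
```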
6.28. Let F : R³ → R² and G : R² → R² be defined by F(x, y, z) = (2x, y + z) and G(x, y) =
(y, x). Derive formulas defining the mappings G∘F and F∘G.
(G∘F)(x, y, z) = G(F(x, y, z)) = G(2x, y + z) = (y + z, 2x)
The mapping F∘G is not defined since the image of G is not contained in the domain of F.
6.29. Show: (i) the zero mapping 0, defined by 0(v) = 0 for every v ∈ V, is the zero ele-
ment of Hom(V, U); (ii) the negative of F ∈ Hom(V, U) is the mapping (−1)F, i.e.
−F = (−1)F.
(i) Let F ∈ Hom(V, U). Then, for every v ∈ V,
(F + 0)(v) = F(v) + 0(v) = F(v) + 0 = F(v)
Since (F + 0)(v) = F(v) for every v ∈ V, F + 0 = F.
(ii) For every v ∈ V,
(F + (−1)F)(v) = F(v) + (−1)F(v) = F(v) − F(v) = 0 = 0(v)
Since (F + (−1)F)(v) = 0(v) for every v ∈ V, F + (−1)F = 0. Thus (−1)F is the negative
of F.
6.30. Show that for F1, ..., Fn ∈ Hom(V, U) and a1, ..., an ∈ K, and for any v ∈ V,
(a1F1 + a2F2 + ... + anFn)(v) = a1F1(v) + a2F2(v) + ... + anFn(v)
By definition of the mapping a1F1, (a1F1)(v) = a1F1(v); hence the theorem holds for n = 1.
Thus by induction,
(a1F1 + a2F2 + ... + anFn)(v) = (a1F1)(v) + (a2F2 + ... + anFn)(v)
= a1F1(v) + a2F2(v) + ... + anFn(v)
6.31. Let F : R³ → R², G : R³ → R² and H : R³ → R² be defined by F(x, y, z) = (x + y + z, x + y),
G(x, y, z) = (2x + z, x + y) and H(x, y, z) = (2y, x). Show that F, G, H ∈ Hom(R³, R²)
are linearly independent.
Suppose, for scalars a, b, c ∈ K,
aF + bG + cH = 0 (1)
(Here 0 is the zero mapping.) For e1 = (1, 0, 0) ∈ R³, we have
(aF + bG + cH)(e1) = aF(1, 0, 0) + bG(1, 0, 0) + cH(1, 0, 0)
= a(1, 1) + b(2, 1) + c(0, 1) = (a + 2b, a + b + c)
and 0(e1) = (0, 0). Thus by (1), (a + 2b, a + b + c) = (0, 0) and so
a + 2b = 0 and a + b + c = 0 (2)
Similarly for e2 = (0, 1, 0) ∈ R³, we have
(aF + bG + cH)(e2) = aF(0, 1, 0) + bG(0, 1, 0) + cH(0, 1, 0)
= a(1, 1) + b(0, 1) + c(2, 0) = (a + 2c, a + b) = 0(e2) = (0, 0)
Thus a + 2c = 0 and a + b = 0 (3)
Using (2) and (3) we obtain a = 0, b = 0, c = 0 (4)
Since (1) implies (4), the mappings F, G and H are linearly independent.
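Because each map in Hom(R³, R²) corresponds to a 2 × 3 matrix, independence of F, G, H is the same as independence of their flattened matrices in R⁶, which a rank computation settles. A sketch with NumPy:

```python
import numpy as np

# Matrices of F, G, H from Problem 6.31 relative to the usual bases
# (row i lists the coefficients of x, y, z in the i-th output coordinate).
F = np.array([[1, 1, 1], [1, 1, 0]], dtype=float)   # F(x,y,z) = (x+y+z, x+y)
G = np.array([[2, 0, 1], [1, 1, 0]], dtype=float)   # G(x,y,z) = (2x+z, x+y)
H = np.array([[0, 2, 0], [1, 0, 0]], dtype=float)   # H(x,y,z) = (2y, x)

# F, G, H are independent in Hom(R^3, R^2) iff their flattened matrices
# are independent in R^6, i.e. the 3 x 6 matrix below has rank 3.
S = np.stack([F.ravel(), G.ravel(), H.ravel()])
assert np.linalg.matrix_rank(S) == 3
```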
6.32. Prove Theorem 6.7: Suppose dim V = m and dim U = n. Then dim Hom(V, U) = mn.
Suppose {v1, ..., vm} is a basis of V and {u1, ..., un} is a basis of U. By Theorem 6.2, a linear
mapping in Hom(V, U) is uniquely determined by arbitrarily assigning elements of U to the basis
elements vi of V. We define
Fij ∈ Hom(V, U),  i = 1, ..., m,  j = 1, ..., n
to be the linear mapping for which Fij(vi) = uj, and Fij(vk) = 0 for k ≠ i. That is, Fij maps vi
into uj and the other v's into 0. Observe that {Fij} contains exactly mn elements; hence the theorem
is proved if we show that it is a basis of Hom(V, U).
Proof that {Fij} generates Hom(V, U). Let F ∈ Hom(V, U). Suppose F(v1) = w1, F(v2) =
w2, ..., F(vm) = wm. Since wk ∈ U, it is a linear combination of the u's; say,
wk = ak1u1 + ak2u2 + ... + aknun,  k = 1, ..., m,  aij ∈ K (1)
Consider the linear mapping G = Σ_{i=1}^{m} Σ_{j=1}^{n} aij Fij. Since G is a linear combination of the Fij, the
proof that {Fij} generates Hom(V, U) is complete if we show that F = G.
We now compute G(vk), k = 1, ..., m. Since Fij(vk) = 0 for k ≠ i and Fkj(vk) = uj,
G(vk) = Σ_{i=1}^{m} Σ_{j=1}^{n} aij Fij(vk) = Σ_{j=1}^{n} akj Fkj(vk) = Σ_{j=1}^{n} akj uj
= ak1u1 + ak2u2 + ... + aknun
Thus by (1), G(vk) = wk for each k. But F(vk) = wk for each k. Accordingly, by Theorem 6.2,
F = G; hence {Fij} generates Hom(V, U).
Proof that {Fij} is linearly independent. Suppose, for scalars aij ∈ K,
Σ_{i=1}^{m} Σ_{j=1}^{n} aij Fij = 0
For vk, k = 1, ..., m,
0 = 0(vk) = Σ_{i=1}^{m} Σ_{j=1}^{n} aij Fij(vk) = Σ_{j=1}^{n} akj Fkj(vk) = Σ_{j=1}^{n} akj uj
= ak1u1 + ak2u2 + ... + aknun
But the uj are linearly independent; hence for k = 1, ..., m, we have ak1 = 0, ak2 = 0, ..., akn = 0.
In other words, all the aij = 0 and so {Fij} is linearly independent.
Thus {Fij} is a basis of Hom(V, U); hence dim Hom(V, U) = mn.
6.33. Prove Theorem 6.8: Let V, U and W be vector spaces over K. Let F, F′ be linear
mappings from V into U and let G, G′ be linear mappings from U into W; and let
k ∈ K. Then: (i) G∘(F + F′) = G∘F + G∘F′; (ii) (G + G′)∘F = G∘F + G′∘F;
(iii) k(G∘F) = (kG)∘F = G∘(kF).
(i) For every v ∈ V,
(G∘(F + F′))(v) = G((F + F′)(v)) = G(F(v) + F′(v))
= G(F(v)) + G(F′(v)) = (G∘F)(v) + (G∘F′)(v) = (G∘F + G∘F′)(v)
Since (G∘(F + F′))(v) = (G∘F + G∘F′)(v) for every v ∈ V, G∘(F + F′) = G∘F + G∘F′.
(ii) For every v ∈ V,
((G + G′)∘F)(v) = (G + G′)(F(v)) = G(F(v)) + G′(F(v))
= (G∘F)(v) + (G′∘F)(v) = (G∘F + G′∘F)(v)
Since ((G + G′)∘F)(v) = (G∘F + G′∘F)(v) for every v ∈ V, (G + G′)∘F = G∘F + G′∘F.
(iii) For every v ∈ V,
(k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = (kG)(F(v)) = ((kG)∘F)(v)
and (k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = G(kF(v)) = G((kF)(v)) = (G∘(kF))(v)
Accordingly, k(G∘F) = (kG)∘F = G∘(kF). (We emphasize that two mappings are shown to
be equal by showing that they assign the same image to each point in the domain.)
6.34. Let F : V → U and G : U → W be linear. Hence G∘F : V → W is linear. Show that
(i) rank (G∘F) ≤ rank G, (ii) rank (G∘F) ≤ rank F.
(i) Since F(V) ⊂ U, we also have G(F(V)) ⊂ G(U) and so dim G(F(V)) ≤ dim G(U). Then
rank (G∘F) = dim ((G∘F)(V)) = dim (G(F(V))) ≤ dim G(U) = rank G
(ii) By Theorem 6.4, dim (G(F(V))) ≤ dim F(V). Hence
rank (G∘F) = dim ((G∘F)(V)) = dim (G(F(V))) ≤ dim F(V) = rank F
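For matrices, composition is matrix multiplication, so these inequalities say rank(GF) ≤ min(rank G, rank F). A quick randomized check with NumPy (the sizes 3 × 4 and 4 × 5 and the integer entries are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
F = rng.integers(-2, 3, size=(4, 5)).astype(float)   # a map R^5 -> R^4
G = rng.integers(-2, 3, size=(3, 4)).astype(float)   # a map R^4 -> R^3

rank = np.linalg.matrix_rank
# Problem 6.34 in matrix form: rank(G F) <= rank G and rank(G F) <= rank F.
assert rank(G @ F) <= rank(G)
assert rank(G @ F) <= rank(F)
```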
ALGEBRA OF LINEAR OPERATORS
6.35. Let S and T be the linear operators on R² defined by S(x, y) = (y, x) and T(x, y) =
(0, x). Find formulas defining the operators S + T, 2S − 3T, ST, TS, S² and T².
(S + T)(x, y) = S(x, y) + T(x, y) = (y, x) + (0, x) = (y, 2x).
(2S − 3T)(x, y) = 2S(x, y) − 3T(x, y) = 2(y, x) − 3(0, x) = (2y, −x).
(ST)(x, y) = S(T(x, y)) = S(0, x) = (x, 0).
(TS)(x, y) = T(S(x, y)) = T(y, x) = (0, y).
S²(x, y) = S(S(x, y)) = S(y, x) = (x, y). Note S² = I, the identity mapping.
T²(x, y) = T(T(x, y)) = T(0, x) = (0, 0). Note T² = 0, the zero mapping.
6.36. Let T be the linear operator on R² defined by
T(3, 1) = (2, −4) and T(1, 1) = (0, 2) (1)
(By Theorem 6.2, such a linear operator exists and is unique.) Find T(a, b). In
particular, find T(7, 4).
First write (a, b) as a linear combination of (3, 1) and (1, 1) using unknown scalars x and y:
(a, b) = x(3, 1) + y(1, 1) (2)
Hence (a, b) = (3x, x) + (y, y) = (3x + y, x + y) and so 3x + y = a, x + y = b
Solving for x and y in terms of a and b,
x = (a − b)/2 and y = (3b − a)/2 (3)
Now using (2), (1) and (3),
T(a, b) = xT(3, 1) + yT(1, 1) = x(2, −4) + y(0, 2)
= (2x, −4x) + (0, 2y) = (2x, −4x + 2y) = (a − b, 5b − 3a)
Thus T(7, 4) = (7 − 4, 20 − 21) = (3, −1).
6.37. Let T be the operator on R³ defined by T(x, y, z) = (2x, 4x − y, 2x + 3y − z). (i) Show
that T is invertible. (ii) Find a formula for T⁻¹.
(i) The kernel W of T is the set of all (x, y, z) such that T(x, y, z) = (0, 0, 0), i.e.,
T(x, y, z) = (2x, 4x − y, 2x + 3y − z) = (0, 0, 0)
Thus W is the solution space of the homogeneous system
2x = 0, 4x − y = 0, 2x + 3y − z = 0
which has only the trivial solution (0, 0, 0). Thus W = {0}; hence T is nonsingular and so by
Theorem 6.9 is invertible.
(ii) Let (r, s, t) be the image of (x, y, z) under T; then (x, y, z) is the image of (r, s, t) under T⁻¹:
T(x, y, z) = (r, s, t) and T⁻¹(r, s, t) = (x, y, z). We will find the values of x, y and z in terms
of r, s and t, and then substitute in the above formula for T⁻¹. From
T(x, y, z) = (2x, 4x − y, 2x + 3y − z) = (r, s, t)
we find x = r/2, y = 2r − s, z = 7r − 3s − t. Thus T⁻¹ is given by
T⁻¹(r, s, t) = (r/2, 2r − s, 7r − 3s − t)
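The same inverse can be obtained by inverting the matrix of T in the usual basis; a NumPy sketch (the test point (4, 1, 2) is an arbitrary choice):

```python
import numpy as np

# Matrix of T(x,y,z) = (2x, 4x - y, 2x + 3y - z) in the usual basis.
A = np.array([[2,  0,  0],
              [4, -1,  0],
              [2,  3, -1]], dtype=float)

# T is invertible since det A != 0, and the matrix of T^{-1} is A^{-1}.
assert np.linalg.det(A) != 0
A_inv = np.linalg.inv(A)

# Compare with the formula T^{-1}(r,s,t) = (r/2, 2r - s, 7r - 3s - t).
r, s, t = 4.0, 1.0, 2.0
expected = np.array([r/2, 2*r - s, 7*r - 3*s - t])
assert np.allclose(A_inv @ np.array([r, s, t]), expected)
```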
6.38. Let V be of finite dimension and let T be a linear operator on V. Recall that T is
invertible if and only if T is nonsingular or one-to-one. Show that T is invertible if
and only if T is onto.
By Theorem 6.4, dim V = dim (Im T) + dim (Ker T). Hence the following statements are
equivalent: (i) T is onto, (ii) Im T = V, (iii) dim (Im T) = dim V, (iv) dim (Ker T) = 0,
(v) Ker T = {0}, (vi) T is nonsingular, (vii) T is invertible.
6.39. Let V be of finite dimension and let T be a linear operator on V for which TS = I,
for some operator S on V. (We call S a right inverse of T.) (i) Show that T is
invertible. (ii) Show that S = T⁻¹. (iii) Give an example showing that the above
need not hold if V is of infinite dimension.
(i) Let dim V = n. By the preceding problem, T is invertible if and only if T is onto; hence T
is invertible if and only if rank T = n. We have n = rank I = rank TS ≤ rank T ≤ n.
Hence rank T = n and T is invertible.
(ii) TT⁻¹ = T⁻¹T = I. Then S = IS = (T⁻¹T)S = T⁻¹(TS) = T⁻¹I = T⁻¹.
(iii) Let V be the space of polynomials in t over K; say, p(t) = a0 + a1t + a2t² + ... + antⁿ. Let T
and S be the operators on V defined by
T(p(t)) = a1 + a2t + ... + antⁿ⁻¹ and S(p(t)) = a0t + a1t² + ... + antⁿ⁺¹
We have (TS)(p(t)) = T(S(p(t))) = T(a0t + a1t² + ... + antⁿ⁺¹)
= a0 + a1t + ... + antⁿ = p(t)
and so TS = I, the identity mapping. On the other hand, if k ∈ K and k ≠ 0, then (ST)(k) =
S(T(k)) = S(0) = 0 ≠ k. Accordingly, ST ≠ I.
6.40. Let S and T be the linear operators on R² defined by S(x, y) = (0, x) and T(x, y) =
(x, 0). Show that TS = 0 but ST ≠ 0. Also show that T² = T.
(TS)(x, y) = T(S(x, y)) = T(0, x) = (0, 0). Since TS assigns 0 = (0, 0) to every (x, y) ∈ R², it
is the zero mapping: TS = 0.
(ST)(x, y) = S(T(x, y)) = S(x, 0) = (0, x). For example, (ST)(4, 2) = (0, 4). Thus ST ≠ 0, since
it does not assign 0 = (0, 0) to every element of R².
For any (x, y) ∈ R², T²(x, y) = T(T(x, y)) = T(x, 0) = (x, 0) = T(x, y). Hence T² = T.
MISCELLANEOUS PROBLEMS
6.41. Let {e1, e2, e3} be a basis of V and {f1, f2} a basis of U. Let T : V → U be linear.
Furthermore, suppose

    T(e1) = a1f1 + a2f2
    T(e2) = b1f1 + b2f2     and     A = [a1  b1  c1]
    T(e3) = c1f1 + c2f2                 [a2  b2  c2]

Show that, for any v ∈ V, A[v]_e = [T(v)]_f where the vectors in K³ and K² are written
as column vectors.
Suppose v = k1e1 + k2e2 + k3e3; then [v]_e is the column vector (k1, k2, k3). Also,
T(v) = k1T(e1) + k2T(e2) + k3T(e3)
= k1(a1f1 + a2f2) + k2(b1f1 + b2f2) + k3(c1f1 + c2f2)
= (a1k1 + b1k2 + c1k3)f1 + (a2k1 + b2k2 + c2k3)f2

Accordingly,    [T(v)]_f = [a1k1 + b1k2 + c1k3]
                           [a2k1 + b2k2 + c2k3]
Computing, we obtain

    A[v]_e = [a1  b1  c1] [k1]   [a1k1 + b1k2 + c1k3]
             [a2  b2  c2] [k2] = [a2k1 + b2k2 + c2k3] = [T(v)]_f
                           [k3]
6.42. Let k be a nonzero scalar. Show that a linear map T is singular if and only if kT
is singular. Hence T is singular if and only if −T is singular.
Suppose T is singular. Then T(v) = 0 for some vector v ≠ 0. Hence (kT)(v) = kT(v) = k0 = 0
and so kT is singular.
Now suppose kT is singular. Then (kT)(w) = 0 for some vector w ≠ 0; hence T(kw) =
kT(w) = (kT)(w) = 0. But k ≠ 0 and w ≠ 0 implies kw ≠ 0; thus T is also singular.
6.43. Let E be a linear operator on V for which E² = E. (Such an operator is termed a
projection.) Let U be the image of E and W the kernel. Show that: (i) if u ∈ U,
then E(u) = u, i.e. E is the identity map on U; (ii) if E ≠ I, then E is singular, i.e.
E(v) = 0 for some v ≠ 0; (iii) V = U ⊕ W.
(i) If u ∈ U, the image of E, then E(v) = u for some v ∈ V. Hence using E² = E, we have
u = E(v) = E²(v) = E(E(v)) = E(u)
(ii) If E ≠ I then, for some v ∈ V, E(v) = u where v ≠ u. By (i), E(u) = u. Thus
E(v − u) = E(v) − E(u) = u − u = 0 where v − u ≠ 0
(iii) We first show that V = U + W. Let v ∈ V. Set u = E(v) and w = v − E(v). Then
v = E(v) + v − E(v) = u + w
By definition, u = E(v) ∈ U, the image of E. We now show that w ∈ W, the kernel of E:
E(w) = E(v − E(v)) = E(v) − E²(v) = E(v) − E(v) = 0
and thus w ∈ W. Hence V = U + W.
We next show that U ∩ W = {0}. Let v ∈ U ∩ W. Since v ∈ U, E(v) = v by (i). Since
v ∈ W, E(v) = 0. Thus v = E(v) = 0 and so U ∩ W = {0}.
The above two properties imply that V = U ⊕ W.
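A concrete instance of this decomposition: the projection of R² onto its first coordinate. A NumPy sketch (the vector (3, −2) is an arbitrary choice):

```python
import numpy as np

# A projection on R^2: E(x, y) = (x, 0). Its matrix satisfies E^2 = E.
E = np.array([[1.0, 0.0],
              [0.0, 0.0]])
assert np.allclose(E @ E, E)

v = np.array([3.0, -2.0])
u = E @ v          # component in U = image of E
w = v - E @ v      # component in W = kernel of E

# The decomposition v = u + w of part (iii), with E the identity on U
# (part (i)) and zero on W:
assert np.allclose(u + w, v)
assert np.allclose(E @ u, u)
assert np.allclose(E @ w, 0)
```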
6.44. Show that a square matrix A is invertible if and only if it is nonsingular. (Compare
with Theorem 6.9, page 130.)
Recall that A is invertible if and only if A is row equivalent to the identity matrix I. Thus the
following statements are equivalent: (i) A is invertible. (ii) A and I are row equivalent. (iii) The
equations AX = 0 and IX = 0 have the same solution space. (iv) AX = 0 has only the zero solu-
tion. (v) A is nonsingular.
Supplementary Problems
MAPPINGS
6.45. State whether each diagram defines a mapping from {1, 2, 3} into {4, 5, 6}.
6.46. Define each of the following mappings f : R → R by a formula:
(i) To each number let f assign its square plus 3.
(ii) To each number let f assign its cube plus twice the number.
(iii) To each number ≥ 3 let f assign the number squared, and to each number < 3 let f assign
the number −2.
6.47. Let f : R → R be defined by f(x) = x² − 4x + 3. Find (i) f(4), (ii) f(−3), (iii) f(y − 2x), (iv) f(x − 2).
6.48. Determine the number of different mappings from {a, b} into {1, 2, 3}.
6.49. Let the mapping g assign to each name in the set {Betty, Martin, David, Alan, Rebecca} the number
of different letters needed to spell the name. Find (i) the graph of g, (ii) the image of g.
6.50. Sketch the graph of each mapping: (i) f(x) = ½x − 1, (ii) g(x) = 2x² − 4x − 3.
6.51. The mappings f : A → B, g : B → A, h : C → B, F : B → C and G : A → C are illustrated in the
diagram below.
Determine whether each of the following defines a composition mapping and, if it does, find its
domain and codomain: (i) g∘f, (ii) h∘f, (iii) F∘f, (iv) G∘f, (v) g∘h, (vi) h∘G∘g.
6.52. Let f : R → R and g : R → R be defined by f(x) = x² + 3x + 1 and g(x) = 2x − 3. Find formulas
defining the composition mappings (i) f∘g, (ii) g∘f, (iii) g∘g, (iv) f∘f.
6.53. For any mapping f : A → B, show that 1_B ∘ f = f = f ∘ 1_A.
6.54. For each of the following mappings f : R → R find a formula for the inverse mapping: (i) f(x) =
3x − 7, (ii) f(x) = x³ + 2.
LINEAR MAPPINGS
6.55. Show that the following mappings F are linear:
(i) F : R² → R² defined by F(x, y) = (2x − y, x).
(ii) F : R³ → R² defined by F(x, y, z) = (z, x + y).
(iii) F : R → R² defined by F(x) = (2x, 3x).
(iv) F : R² → R² defined by F(x, y) = (ax + by, cx + dy) where a, b, c, d ∈ R.
6.56. Show that the following mappings F are not linear:
(i) F : R² → R² defined by F(x, y) = (x², y²).
(ii) F : R³ → R² defined by F(x, y, z) = (x + 1, y + z).
(iii) F : R → R² defined by F(x) = (x, 1).
(iv) F : R² → R defined by F(x, y) = |x − y|.
6.57. Let V be the vector space of polynomials in t over K. Show that the mappings T : V → V and
S : V → V defined below are linear:
T(a0 + a1t + ... + antⁿ) = a0t + a1t² + ... + antⁿ⁺¹
S(a0 + a1t + ... + antⁿ) = a1 + a2t + ... + antⁿ⁻¹
6.58. Let V be the vector space of n × n matrices over K, and let M be an arbitrary matrix in V. Show
that the first two mappings T : V → V are linear, but the third is not linear (unless M = 0):
(i) T(A) = MA, (ii) T(A) = MA − AM, (iii) T(A) = M + A.
6.59. Find T(a, b) where T : R² → R³ is defined by T(1, 2) = (3, −1, 5) and T(0, 1) = (2, 1, −1).
6.60. Find T(a, b, c) where T : R³ → R is defined by
T(1, 1, 1) = 3, T(0, 1, −2) = 1 and T(0, 0, 1) = −2
6.61. Suppose F : V → U is linear. Show that, for any v ∈ V, F(−v) = −F(v).
6.62. Let W be a subspace of V. Show that the inclusion map of W into V, denoted by i : W ⊂ V and
defined by i(w) = w, is linear.
KERNEL AND IMAGE OF LINEAR MAPPINGS
6.63. For each of the following linear mappings F, find a basis and the dimension of (a) its image U
and (b) its kernel W:
(i) F : R³ → R³ defined by F(x, y, z) = (x + 2y, y − z, x + 2z).
(ii) F : R² → R² defined by F(x, y) = (x + y, x + y).
(iii) F : R³ → R² defined by F(x, y, z) = (x + y, y + z).
6.64. Let V be the vector space of 2 × 2 matrices over R and let M = [1 2; 3 6]. Let F : V → V be the
linear map defined by F(A) = MA. Find a basis and the dimension of (i) the kernel W of F and
(ii) the image U of F.
6.65. Find a linear mapping F : R³ → R³ whose image is generated by (1, 2, 3) and (4, 5, 6).
6.66. Find a linear mapping F : R⁴ → R³ whose kernel is generated by (1, 2, 3, 4) and (0, 1, 1, 1).
6.67. Let V be the vector space of polynomials in t over R. Let D : V → V be the differential operator:
D(f) = df/dt. Find the kernel and image of D.
6.68. Let F : V → U be linear. Show that (i) the image of any subspace of V is a subspace of U and
(ii) the preimage of any subspace of U is a subspace of V.
6.69. Each of the following matrices determines a linear map from R⁴ into R³:
'12 1^
(i) A = ( 2 1 2 1 I (ii) B =
^1 3 2 2/
Find a basis and the dimension of the image U and the kernel W of each map.
6.70. Let T : C → C be the conjugate mapping on the complex field C. That is, T(z) = z̄ where z ∈ C,
or T(a + bi) = a − bi where a, b ∈ R. (i) Show that T is not linear if C is viewed as a vector
space over itself. (ii) Show that T is linear if C is viewed as a vector space over the real field R.
OPERATIONS WITH LINEAR MAPPINGS
6.71. Let F : R³ → R² and G : R³ → R² be defined by F(x, y, z) = (y, x + z) and G(x, y, z) = (2z, x − y).
Find formulas defining the mappings F + G and 3F − 2G.
6.72. Let H : R² → R² be defined by H(x, y) = (y, 2x). Using the mappings F and G in the preceding
problem, find formulas defining the mappings: (i) H∘F and H∘G, (ii) F∘H and G∘H,
(iii) H∘(F + G) and H∘F + H∘G.
6.73. Show that the following mappings F, G and H are linearly independent:
(i) F, G, H ∈ Hom(R², R²) defined by
F(x, y) = (x, 2y), G(x, y) = (y, x + y), H(x, y) = (0, x).
(ii) F, G, H ∈ Hom(R³, R) defined by
F(x, y, z) = x + y + z, G(x, y, z) = y + z, H(x, y, z) = x − z.
6.74. For F, G ∈ Hom(V, U), show that rank (F + G) ≤ rank F + rank G. (Here V has finite
dimension.)
6.75. Let F : V → U and G : U → V be linear. Show that if F and G are nonsingular then G∘F is
nonsingular. Give an example where G∘F is nonsingular but G is not.
6.76. Prove that Hom(V, U) does satisfy all the required axioms of a vector space. That is, prove
Theorem 6.6, page 128.
ALGEBRA OF LINEAR OPERATORS
6.77. Let S and T be the linear operators on R² defined by S(x, y) = (x + y, 0) and T(x, y) = (−y, x).
Find formulas defining the operators S + T, 5S − 3T, ST, TS, S² and T².
6.78. Let T be the linear operator on R² defined by T(x, y) = (x + 2y, 3x + 4y). Find p(T) where
p(t) = t² − 5t − 2.
6.79. Show that each of the following operators T on R³ is invertible, and find a formula for T⁻¹:
(i) T(x, y, z) = (x − 3y − 2z, y − 4z, z), (ii) T(x, y, z) = (x + z, x − z, y).
6.80. Suppose S and T are linear operators on V and that S is nonsingular. Assume V has finite dimen-
sion. Show that rank (ST) = rank (TS) = rank T.
6.81. Suppose V = U ⊕ W. Let E1 and E2 be the linear operators on V defined by E1(v) = u,
E2(v) = w, where v = u + w, u ∈ U, w ∈ W. Show that: (i) E1² = E1 and E2² = E2, i.e. that E1
and E2 are "projections"; (ii) E1 + E2 = I, the identity mapping; (iii) E1E2 = 0 and E2E1 = 0.
6.82. Let E1 and E2 be linear operators on V satisfying (i), (ii) and (iii) of Problem 6.81. Show that V
is the direct sum of the image of E1 and the image of E2: V = Im E1 ⊕ Im E2.
6.83. Show that if the linear operators S and T are invertible, then ST is invertible and (ST)⁻¹ = T⁻¹S⁻¹.
6.84. Let V have finite dimension, and let T be a linear operator on V such that rank (T²) = rank T.
Show that Ker T ∩ Im T = {0}.
MISCELLANEOUS PROBLEMS
6.85. Suppose T : Rⁿ → Rᵐ is a linear mapping. Let {e1, ..., en} be the usual basis of Rⁿ and let A be
the m × n matrix whose columns are the vectors T(e1), ..., T(en) respectively. Show that, for every
vector v ∈ Rⁿ, T(v) = Av, where v is written as a column vector.
6.86. Suppose F : V → U is linear and k is a nonzero scalar. Show that the maps F and kF have the
same kernel and the same image.
6.87. Show that if F : V → U is onto, then dim U ≤ dim V. Determine all linear maps T : R³ → R⁴
which are onto.
6.88. Find those theorems of Chapter 3 which prove that the space of n-square matrices over K is an
associative algebra over K.
6.89. Let T : V → U be linear and let W be a subspace of V. The restriction of T to W is the map
T_W : W → U defined by T_W(w) = T(w), for every w ∈ W. Prove the following. (i) T_W is linear.
(ii) Ker T_W = Ker T ∩ W. (iii) Im T_W = T(W).
6.90. Two operators S, T ∈ A(V) are said to be similar if there exists an invertible operator P ∈ A(V)
for which S = P⁻¹TP. Prove the following. (i) Similarity of operators is an equivalence relation.
(ii) Similar operators have the same rank (when V has finite dimension).
Answers to Supplementary Problems
6.45. (i) No, (ii) Yes, (iii) No.
6.46. (i) f(x) = x² + 3, (ii) f(x) = x³ + 2x, (iii) f(x) = x² if x ≥ 3, and f(x) = −2 if x < 3.
6.47. (i) 3, (ii) 24, (iii) y² − 4xy + 4x² − 4y + 8x + 3, (iv) x² − 8x + 15.
6.48. Nine.
6.49. (i) {(Betty, 4), (Martin, 6), (David, 4), (Alan, 3), (Rebecca, 5)}.
(ii) Image of g  {3, 4, 5, 6}.
6.51. (i) (g∘f) : A → A, (ii) No, (iii) (F∘f) : A → C, (iv) No, (v) (g∘h) : C → A, (vi) (h∘G∘g) : B → B.
6.52. (i) (f∘g)(x) = 4x² − 6x + 1, (iii) (g∘g)(x) = 4x − 9,
(ii) (g∘f)(x) = 2x² + 6x − 1, (iv) (f∘f)(x) = x⁴ + 6x³ + 14x² + 15x + 5.
6.54. (i) f⁻¹(x) = (x + 7)/3, (ii) f⁻¹(x) = ∛(x − 2).
6.59. T(a, b) = (−a + 2b, −3a + b, 7a − b).
6.60. T(a, b, c) = 8a − 3b − 2c.
6.61. F(v) + F(−v) = F(v + (−v)) = F(0) = 0; hence F(−v) = −F(v).
6.63. (i) (a) {(1, 0, 1), (0, 1, −2)}, dim U = 2; (b) {(−2, 1, 1)}, dim W = 1.
(ii) (a) {(1, 1)}, dim U = 1; (b) {(1, −1)}, dim W = 1.
(iii) (a) {(1, 0), (0, 1)}, dim U = 2; (b) {(1, −1, 1)}, dim W = 1.
6.64. (i) { [−2 0; 1 0], [0 −2; 0 1] } is a basis of Ker F; dim (Ker F) = 2.
(ii) { [1 0; 3 0], [0 1; 0 3] } is a basis of Im F; dim (Im F) = 2.
6.65. F(x, y, z) = (x + 4y, 2x + 5y, 3x + 6y).
6.66. F(x, y, z, w) = (x + y − z, 2x + y − w, 0).
6.67. The kernel of D is the set of constant polynomials. The image of D is the entire space V.
6.69. (i) (a) {(1, 2, 1), (0, 1, 1)} is a basis of Im A; dim (Im A) = 2.
(b) {(4, 2, 5, 0), (1, 3, 0, 5)} is a basis of Ker A; dim (Ker A) = 2.
(ii) (a) Im B = R³; (b) {(1, 2/3, 1, 1)} is a basis of Ker B; dim (Ker B) = 1.
6.71. (F + G)(x, y, z) = (y + 2z, 2x − y + z), (3F − 2G)(x, y, z) = (3y − 4z, x + 2y + 3z).
6.72. (i) (H∘F)(x, y, z) = (x + z, 2y), (H∘G)(x, y, z) = (x − y, 4z). (ii) Not defined.
(iii) (H∘(F + G))(x, y, z) = (H∘F + H∘G)(x, y, z) = (2x − y + z, 2y + 4z).
6.77. (S + T)(x, y) = (x, x) (ST){x, y) =z {x y, 0)
(5S  3r)(a;, y) = (5a; + 8y, 3x) (TS){x, y) = (0, a; + y)
SHx, v) = (x + y, 0); note that S^ = S.
Ti(x^ y) = {X, y); note that T^\I = Q, hence T is a zero of x'^ + 1.
6.78. v(T) = 0.
6.79. (i) T^-1(r, s, t) = (14t + 3s + r, 4t + s, t), (ii) T^-1(r, s, t) = (r/2 + s/2, t, r/2 - s/2).
6.87. There are no linear maps from R^3 into R^4 which are onto.
Chapter 7
Matrices and Linear Operators
INTRODUCTION
Suppose {e1, ..., en} is a basis of a vector space V over a field K and, for v ∈ V, suppose
v = a1 e1 + a2 e2 + ... + an en. Then the coordinate vector of v relative to {ei}, which we write
as a column vector unless otherwise specified or implied, is

    [v]e = (a1, a2, ..., an)^t

Recall that the mapping v ↦ [v]e, determined by the basis {ei}, is an isomorphism from V
onto the space K^n.
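In modern numerical terms, finding [v]e amounts to solving a linear system whose coefficient matrix has the basis vectors as columns. A minimal numpy sketch (the basis and vector below are chosen for illustration; they are not from the text):

```python
import numpy as np

# A basis of R^2 placed as the columns of E, and a sample vector v.
E = np.array([[1.0, 2.0],
              [3.0, 5.0]])        # e1 = (1, 3), e2 = (2, 5)
v = np.array([5.0, 12.0])

# The coordinate vector [v]_e solves E @ coords = v.
coords = np.linalg.solve(E, v)
print(coords)                     # coordinates of v relative to {e1, e2}

# Reconstructing v from its coordinates recovers v, since the map is an isomorphism.
assert np.allclose(E @ coords, v)
```

Here the coordinate map and its inverse are both realized by the single invertible matrix E, which is exactly the isomorphism V → K^n described above.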
In this chapter we show that there is also an isomorphism, determined by the basis
{ei}, from the algebra A(V) of linear operators on V onto the algebra 𝒜 of n-square matrices
over K.
A similar result also holds for linear mappings F : V → U, from one space into another.
MATRIX REPRESENTATION OF A LINEAR OPERATOR
Let T be a linear operator on a vector space V over a field K and suppose {e1, ..., en} is
a basis of V. Now T(e1), ..., T(en) are vectors in V and so each is a linear combination of
the elements of the basis {ei}:

    T(e1) = a11 e1 + a12 e2 + ... + a1n en
    T(e2) = a21 e1 + a22 e2 + ... + a2n en
    ......................................
    T(en) = an1 e1 + an2 e2 + ... + ann en

The following definition applies.

Definition: The transpose of the above matrix of coefficients, denoted by [T]e or [T], is
called the matrix representation of T relative to the basis {ei} or simply the
matrix of T in the basis {ei}:

            ( a11  a21  ...  an1 )
    [T]e =  ( a12  a22  ...  an2 )
            ( .................. )
            ( a1n  a2n  ...  ann )
Example 7.1: Let V be the vector space of polynomials in t over R of degree ≤ 3, and let D : V → V
be the differential operator defined by D(p(t)) = d(p(t))/dt. We compute the matrix
of D in the basis {1, t, t^2, t^3}. We have:
    D(1)   = 0    = 0 + 0t + 0t^2 + 0t^3
    D(t)   = 1    = 1 + 0t + 0t^2 + 0t^3
    D(t^2) = 2t   = 0 + 2t + 0t^2 + 0t^3
    D(t^3) = 3t^2 = 0 + 0t + 3t^2 + 0t^3
Accordingly,
            ( 0  1  0  0 )
    [D]  =  ( 0  0  2  0 )
            ( 0  0  0  3 )
            ( 0  0  0  0 )

Example 7.2: Let T be the linear operator on R^2 defined by T(x, y) = (4x - 2y, 2x + y). We com-
pute the matrix of T in the basis {f1 = (1, 1), f2 = (-1, 0)}. We have
    T(f1) = T(1, 1) = (2, 3) = 3(1, 1) + (-1, 0) = 3f1 + f2
    T(f2) = T(-1, 0) = (-4, -2) = -2(1, 1) + 2(-1, 0) = -2f1 + 2f2
Accordingly,
    [T]f = ( 3  -2 )
           ( 1   2 )
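The computation in Examples 7.1 and 7.2 can be automated: column i of the matrix holds the coordinates of T(f_i) in the basis, which is a linear solve. A hedged numpy sketch (assuming, as read from Example 7.2, that the basis is f1 = (1, 1), f2 = (-1, 0); `matrix_rep` is an illustrative helper, not the book's notation):

```python
import numpy as np

def matrix_rep(T, basis):
    """Matrix of the operator T relative to `basis`.

    Column i holds the coordinates of T(f_i) in the basis, so the result
    acts on coordinate vectors exactly as Theorem 7.1 states.
    """
    F = np.column_stack(basis)                     # basis vectors as columns
    images = np.column_stack([T(f) for f in basis])
    return np.linalg.solve(F, images)              # coordinates of each image

T = lambda v: np.array([4*v[0] - 2*v[1], 2*v[0] + v[1]])  # T(x,y) = (4x-2y, 2x+y)
f1, f2 = np.array([1.0, 1.0]), np.array([-1.0, 0.0])

print(matrix_rep(T, [f1, f2]))    # should reproduce the matrix of Example 7.2
```

Solving F X = [T(f1) | T(f2)] in one call handles all the columns at once; no explicit inverse is needed.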
Remark: Recall that any n-square matrix A over K defines a linear operator on K^n by
the map v ↦ Av (where v is written as a column vector). We show (Problem
7.7) that the matrix representation of this operator is precisely the matrix A
if we use the usual basis of K^n.
Our first theorem tells us that the "action" of an operator T on a vector v is preserved
by its matrix representation:
Theorem 7.1: Let {e1, ..., en} be a basis of V and let T be any operator on V. Then, for
any vector v ∈ V, [T]e [v]e = [T(v)]e.
That is, if we multiply the coordinate vector of v by the matrix representation of T,
then we obtain the coordinate vector of T(v).
Example 7.3: Consider the differential operator D : V → V in Example 7.1. Let
    p(t) = a + bt + ct^2 + dt^3 and so D(p(t)) = b + 2ct + 3dt^2
Hence, relative to the basis {1, t, t^2, t^3},
    [p(t)] = (a, b, c, d)^t and [D(p(t))] = (b, 2c, 3d, 0)^t
We show that Theorem 7.1 does hold here:
    [D][p(t)] = ( 0  1  0  0 ) ( a )   ( b  )
                ( 0  0  2  0 ) ( b ) = ( 2c ) = [D(p(t))]
                ( 0  0  0  3 ) ( c )   ( 3d )
                ( 0  0  0  0 ) ( d )   ( 0  )
Example 7.4: Consider the linear operator T : R^2 → R^2 in Example 7.2: T(x, y) = (4x - 2y, 2x + y).
Let v = (5, 7). Then
    v = (5, 7) = 7(1, 1) + 2(-1, 0) = 7f1 + 2f2
    T(v) = (6, 17) = 17(1, 1) + 11(-1, 0) = 17f1 + 11f2
where f1 = (1, 1) and f2 = (-1, 0). Hence, relative to the basis {f1, f2},
    [v]f = (7, 2)^t and [T(v)]f = (17, 11)^t
Using the matrix [T]f in Example 7.2, we verify that Theorem 7.1 holds here:
    [T]f [v]f = ( 3  -2 ) ( 7 )  =  ( 17 )  =  [T(v)]f
                ( 1   2 ) ( 2 )     ( 11 )
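Theorem 7.1 is easy to spot-check numerically. The sketch below reuses the numbers of Examples 7.2 and 7.4 (the matrix [T]f and the coordinate vector of v = (5, 7)):

```python
import numpy as np

Tf = np.array([[3.0, -2.0],
               [1.0,  2.0]])      # [T]_f from Example 7.2
vf = np.array([7.0, 2.0])         # [v]_f for v = (5, 7), per Example 7.4

# Theorem 7.1: multiplying coordinates by the matrix of T gives
# the coordinates of T(v), namely (17, 11)^t.
print(Tf @ vf)
```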
Now we have associated a matrix [T]e to each T in A(V), the algebra of linear operators
on V. By our first theorem the action of an individual operator T is preserved by this
representation. The next two theorems tell us that the three basic operations with these
operators
(i) addition, (ii) scalar multiplication, (iii) composition
are also preserved.
Theorem 7.2: Let {e1, ..., en} be a basis of V over K, and let 𝒜 be the algebra of
n-square matrices over K. Then the mapping T ↦ [T]e is a vector space
isomorphism from A(V) onto 𝒜. That is, the mapping is one-one and onto
and, for any S, T ∈ A(V) and any k ∈ K,
    [T + S]e = [T]e + [S]e and [kT]e = k[T]e
Theorem 7.3: For any operators S, T ∈ A(V), [ST]e = [S]e [T]e.
We illustrate the above theorems in the case dim V = 2. Suppose {e1, e2} is a basis of
V, and T and S are operators on V for which
    T(e1) = a1 e1 + a2 e2      S(e1) = c1 e1 + c2 e2
    T(e2) = b1 e1 + b2 e2      S(e2) = d1 e1 + d2 e2
Then
    [T]e = ( a1  b1 )      [S]e = ( c1  d1 )
           ( a2  b2 )             ( c2  d2 )
Now we have
    (T + S)(e1) = T(e1) + S(e1) = a1 e1 + a2 e2 + c1 e1 + c2 e2
                = (a1 + c1)e1 + (a2 + c2)e2
    (T + S)(e2) = T(e2) + S(e2) = b1 e1 + b2 e2 + d1 e1 + d2 e2
                = (b1 + d1)e1 + (b2 + d2)e2
Thus
    [T + S]e = ( a1 + c1   b1 + d1 ) = ( a1  b1 ) + ( c1  d1 ) = [T]e + [S]e
               ( a2 + c2   b2 + d2 )   ( a2  b2 )   ( c2  d2 )
Also, for k ∈ K, we have
    (kT)(e1) = k T(e1) = k(a1 e1 + a2 e2) = ka1 e1 + ka2 e2
    (kT)(e2) = k T(e2) = k(b1 e1 + b2 e2) = kb1 e1 + kb2 e2
Thus
    [kT]e = ( ka1  kb1 ) = k ( a1  b1 ) = k[T]e
            ( ka2  kb2 )     ( a2  b2 )
Finally, we have
    (ST)(e1) = S(T(e1)) = S(a1 e1 + a2 e2) = a1 S(e1) + a2 S(e2)
             = a1(c1 e1 + c2 e2) + a2(d1 e1 + d2 e2)
             = (a1 c1 + a2 d1)e1 + (a1 c2 + a2 d2)e2
    (ST)(e2) = S(T(e2)) = S(b1 e1 + b2 e2) = b1 S(e1) + b2 S(e2)
             = b1(c1 e1 + c2 e2) + b2(d1 e1 + d2 e2)
             = (b1 c1 + b2 d1)e1 + (b1 c2 + b2 d2)e2
Accordingly,
    [ST]e = ( a1 c1 + a2 d1   b1 c1 + b2 d1 ) = ( c1  d1 ) ( a1  b1 ) = [S]e [T]e
            ( a1 c2 + a2 d2   b1 c2 + b2 d2 )   ( c2  d2 ) ( a2  b2 )
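The bookkeeping above can be checked by machine: representing operators as matrices in the standard basis and conjugating by the basis matrix reproduces Theorem 7.3. A sketch with two sample operators on R^2 and an arbitrary basis (all choices below are illustrative assumptions):

```python
import numpy as np

def rep(T, F):
    # Matrix of the operator v -> T @ v relative to the basis given by the columns of F.
    return np.linalg.solve(F, T @ F)

S = np.array([[1.0, 1.0], [0.0, 0.0]])     # sample operator S(x, y) = (x + y, 0)
T = np.array([[0.0, -1.0], [1.0, 0.0]])    # sample operator T(x, y) = (-y, x)
F = np.array([[1.0, 2.0], [3.0, 5.0]])     # a basis of R^2 as columns

# Composition ST corresponds to the matrix product S @ T, and its
# representation factors as [S]_f [T]_f, as Theorem 7.3 asserts.
assert np.allclose(rep(S @ T, F), rep(S, F) @ rep(T, F))
assert np.allclose(rep(S + T, F), rep(S, F) + rep(T, F))   # Theorem 7.2
print("ok")
```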
CHANGE OF BASIS
We have shown that we can represent vectors by n-tuples (column vectors) and linear
operators by matrices once we have selected a basis. We ask the following natural question:
How does our representation change if we select another basis? In order to answer this
question, we first need a definition.
Definition: Let {e1, ..., en} be a basis of V and let {f1, ..., fn} be another basis. Suppose
    f1 = a11 e1 + a12 e2 + ... + a1n en
    f2 = a21 e1 + a22 e2 + ... + a2n en
    ......................................
    fn = an1 e1 + an2 e2 + ... + ann en
Then the transpose P of the above matrix of coefficients is termed the transi-
tion matrix from the "old" basis {ei} to the "new" basis {fi}:
        ( a11  a21  ...  an1 )
    P = ( a12  a22  ...  an2 )
        ( .................. )
        ( a1n  a2n  ...  ann )
We comment that since the vectors f1, ..., fn are linearly independent, the matrix P is
invertible (Problem 5.47). In fact, its inverse P^-1 is the transition matrix from the basis
{fi} back to the basis {ei}.
Example 7.5: Consider the following two bases of R^2:
    {e1 = (1, 0), e2 = (0, 1)} and {f1 = (1, 1), f2 = (-1, 0)}
Then
    f1 = (1, 1) = (1, 0) + (0, 1) = e1 + e2
    f2 = (-1, 0) = -(1, 0) + 0(0, 1) = -e1 + 0e2
Hence the transition matrix P from the basis {ei} to the basis {fi} is
    P = ( 1  -1 )
        ( 1   0 )
We also have
    e1 = (1, 0) = 0(1, 1) - (-1, 0) = 0f1 - f2
    e2 = (0, 1) = (1, 1) + (-1, 0) = f1 + f2
Hence the transition matrix Q from the basis {fi} back to the basis {ei} is
    Q = (  0  1 )
        ( -1  1 )
Observe that P and Q are inverses:
    PQ = ( 1  -1 ) (  0  1 ) = ( 1  0 ) = I
         ( 1   0 ) ( -1  1 )   ( 0  1 )
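When the old basis is the usual one, the transition matrix simply has the new basis vectors as its columns, and the reverse transition matrix is its inverse. A numpy sketch of Example 7.5 (assuming f2 = (-1, 0), a sign that is easy to lose in print):

```python
import numpy as np

# Transition matrix from the usual basis {e1, e2} to {f1, f2}: its columns
# are the coordinates of f1, f2 in {e_i}, i.e. f1 and f2 themselves.
P = np.array([[1.0, -1.0],
              [1.0,  0.0]])       # columns f1 = (1, 1), f2 = (-1, 0)

Q = np.linalg.inv(P)              # transition matrix back from {f_i} to {e_i}
print(Q)
assert np.allclose(P @ Q, np.eye(2))
```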
We now show how coordinate vectors are affected by a change of basis.
Theorem 7.4: Let P be the transition matrix from a basis {ei} to a basis {fi} in a vector
space V. Then, for any vector v ∈ V, P[v]f = [v]e. Hence [v]f = P^-1 [v]e.
We emphasize that even though P is called the transition matrix from the old basis
{ei} to the new basis {fi}, its effect is to transform the coordinates of a vector in the new
basis {fi} back to the coordinates in the old basis {ei}.
We illustrate the above theorem in the case dim V = 3. Suppose P is the transition
matrix from a basis {e1, e2, e3} of V to a basis {f1, f2, f3} of V; say,
    f1 = a1 e1 + a2 e2 + a3 e3                  ( a1  b1  c1 )
    f2 = b1 e1 + b2 e2 + b3 e3      Hence P =   ( a2  b2  c2 )
    f3 = c1 e1 + c2 e2 + c3 e3                  ( a3  b3  c3 )
Now suppose v ∈ V and, say, v = k1 f1 + k2 f2 + k3 f3. Then, substituting for the fi from
above, we obtain
    v = k1(a1 e1 + a2 e2 + a3 e3) + k2(b1 e1 + b2 e2 + b3 e3) + k3(c1 e1 + c2 e2 + c3 e3)
      = (a1 k1 + b1 k2 + c1 k3)e1 + (a2 k1 + b2 k2 + c2 k3)e2 + (a3 k1 + b3 k2 + c3 k3)e3
Thus
    [v]f = ( k1 )              ( a1 k1 + b1 k2 + c1 k3 )
           ( k2 )  and  [v]e = ( a2 k1 + b2 k2 + c2 k3 )
           ( k3 )              ( a3 k1 + b3 k2 + c3 k3 )
Accordingly,
    P[v]f = ( a1  b1  c1 ) ( k1 )   ( a1 k1 + b1 k2 + c1 k3 )
            ( a2  b2  c2 ) ( k2 ) = ( a2 k1 + b2 k2 + c2 k3 ) = [v]e
            ( a3  b3  c3 ) ( k3 )   ( a3 k1 + b3 k2 + c3 k3 )
Also, multiplying the above equation by P^-1, we have
    P^-1 [v]e = P^-1 P [v]f = I [v]f = [v]f
Example 7.6: Let v = (a, b) ∈ R^2. Then, for the bases of R^2 in the preceding example,
    v = (a, b) = a(1, 0) + b(0, 1) = a e1 + b e2
    v = (a, b) = b(1, 1) + (b - a)(-1, 0) = b f1 + (b - a) f2
Hence [v]e = (a, b)^t and [v]f = (b, b - a)^t.
By the preceding example, the transition matrix P from {ei} to {fi} and its inverse
P^-1 are given by
    P = ( 1  -1 )        P^-1 = (  0  1 )
        ( 1   0 )               ( -1  1 )
We verify the result of Theorem 7.4:
    P[v]f = ( 1  -1 ) ( b     )  =  ( a )  = [v]e
            ( 1   0 ) ( b - a )     ( b )
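Theorem 7.4 can be confirmed for a concrete vector as well. The sketch below uses the transition matrix of Example 7.5 (assuming f2 = (-1, 0)) and a sample v = (a, b):

```python
import numpy as np

P = np.array([[1.0, -1.0],
              [1.0,  0.0]])       # transition matrix from {e_i} to {f_i}

a, b = 3.0, 5.0                   # a sample vector v = (a, b)
v_e = np.array([a, b])
v_f = np.array([b, b - a])        # coordinates in {f1, f2}, per Example 7.6

# Theorem 7.4: P maps the new coordinates back to the old ones,
# and P^-1 maps old coordinates to new ones.
assert np.allclose(P @ v_f, v_e)
assert np.allclose(np.linalg.solve(P, v_e), v_f)
print("ok")
```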
The next theorem shows how matrix representations of linear operators are affected
by a change of basis.
Theorem 7.5: Let P be the transition matrix from a basis {ei} to a basis {fi} in a vector
space V. Then for any linear operator T on V, [T]f = P^-1 [T]e P.
Example 7.7: Let T be the linear operator on R^2 defined by T(x, y) = (4x - 2y, 2x + y). Then for
the bases of R^2 in Example 7.5, we have
    T(e1) = T(1, 0) = (4, 2) = 4(1, 0) + 2(0, 1) = 4e1 + 2e2
    T(e2) = T(0, 1) = (-2, 1) = -2(1, 0) + (0, 1) = -2e1 + e2
Accordingly,
    [T]e = ( 4  -2 )
           ( 2   1 )
We compute [T]f using Theorem 7.5:
    [T]f = P^-1 [T]e P = (  0  1 ) ( 4  -2 ) ( 1  -1 )  =  ( 3  -2 )
                         ( -1  1 ) ( 2   1 ) ( 1   0 )     ( 1   2 )
Note that this agrees with the derivation of [T]f in Example 7.2.
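The conjugation in Example 7.7 is a one-liner numerically. The matrices below are [T]e for T(x, y) = (4x - 2y, 2x + y) and the transition matrix of Example 7.5 (assuming f2 = (-1, 0)):

```python
import numpy as np

Te = np.array([[4.0, -2.0],
               [2.0,  1.0]])      # [T]_e in the usual basis
P  = np.array([[1.0, -1.0],
               [1.0,  0.0]])      # transition matrix from {e_i} to {f_i}

Tf = np.linalg.inv(P) @ Te @ P    # Theorem 7.5: [T]_f = P^-1 [T]_e P
print(Tf)
```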
Remark: Suppose P = (aij) is any n-square invertible matrix over a field K. Now if
{e1, ..., en} is a basis of a vector space V over K, then the n vectors
    fi = a1i e1 + a2i e2 + ... + ani en,   i = 1, ..., n
are linearly independent (Problem 5.47) and so form another basis of V.
Furthermore, P is the transition matrix from the basis {ei} to the basis {fi}.
Accordingly, if A is any matrix representation of a linear operator T on V,
then the matrix B = P^-1 AP is also a matrix representation of T.
SIMILARITY
Suppose A and B are square matrices for which there exists an invertible matrix P
such that B = P^-1 AP. Then B is said to be similar to A or is said to be obtained from A
by a similarity transformation. We show (Problem 7.16) that similarity of matrices is an
equivalence relation. Thus by Theorem 7.5 and the above remark, we have the following
basic result.
Theorem 7.6: Two matrices A and B represent the same linear operator T if and only if
they are similar to each other.
That is, all the matrix representations of the linear operator T form an equivalence
class of similar matrices.
A linear operator T is said to be diagonalizable if for some basis {ei} it is represented
by a diagonal matrix; the basis {ei} is then said to diagonalize T. The preceding theorem
gives us the following result.
Theorem 7.7: Let A be a matrix representation of a linear operator T. Then T is
diagonalizable if and only if there exists an invertible matrix P such that
P^-1 AP is a diagonal matrix.
That is, T is diagonalizable if and only if its matrix representation can be diagonalized
by a similarity transformation.
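In practice the diagonalizing matrix P has eigenvectors as its columns; when they are independent, P^-1 AP is the diagonal matrix of eigenvalues. A sketch with a sample diagonalizable matrix (eigenvalue methods are developed later in the book; here numpy supplies them):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])        # sample diagonalizable matrix

# Columns of P are eigenvectors of A; since A has distinct eigenvalues
# they are independent, so P is invertible and P^-1 A P is diagonal.
eigvals, P = np.linalg.eig(A)
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))
assert np.allclose(D, np.diag(eigvals))
```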
We emphasize that not every operator is diagonalizable. However, we will show
(Chapter 10) that every operator T can be represented by certain "standard" matrices
called its normal or canonical forms. We comment now that that discussion will require
some theory of fields, polynomials and determinants.
Now suppose f is a function on square matrices which assigns the same value to similar
matrices; that is, f(A) = f(B) whenever A is similar to B. Then f induces a function, also
denoted by f, on linear operators T in the following natural way: f(T) = f([T]e), where {ei}
is any basis. The function is well-defined by the preceding theorem.
The determinant is perhaps the most important example of the above type of functions.
Another important example follows.
Example 7.8: The trace of a square matrix A = (aij), written tr(A), is defined to be the sum of
its diagonal elements:
    tr(A) = a11 + a22 + ... + ann
We show (Problem 7.17) that similar matrices have the same trace. Thus we can
speak of the trace of a linear operator T; it is the trace of any one of its matrix
representations: tr(T) = tr([T]e).
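Both the determinant and the trace can be checked numerically to be similarity invariants; for the trace this rests on tr(AB) = tr(BA). A sketch with random matrices (the random seed and sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
P = np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # a small perturbation of I, invertible here

# tr(AB) = tr(BA); taking B = P and A replaced by AP shows
# tr(P^-1 A P) = tr(A P P^-1) = tr(A), i.e. trace is a similarity invariant.
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
assert np.isclose(np.trace(np.linalg.inv(P) @ A @ P), np.trace(A))
print("ok")
```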
MATRICES AND LINEAR MAPPINGS
We now consider the general case of linear mappings from one space into another.
Let V and U be vector spaces over the same field K and, say, dim V = m and dim U = n.
Furthermore, let {e1, ..., em} and {f1, ..., fn} be arbitrary but fixed bases of V and U
respectively.
Suppose F : V → U is a linear mapping. Then the vectors F(e1), ..., F(em) belong to
U and so each is a linear combination of the fi:
    F(e1) = a11 f1 + a12 f2 + ... + a1n fn
    F(e2) = a21 f1 + a22 f2 + ... + a2n fn
    ......................................
    F(em) = am1 f1 + am2 f2 + ... + amn fn
The transpose of the above matrix of coefficients, denoted by [F]e^f, is called the matrix
representation of F relative to the bases {ei} and {fi}, or the matrix of F in the bases {ei}
and {fi}:
             ( a11  a21  ...  am1 )
    [F]e^f = ( a12  a22  ...  am2 )
             ( .................. )
             ( a1n  a2n  ...  amn )
The following theorems apply.
Theorem 7.8: For any vector v ∈ V, [F]e^f [v]e = [F(v)]f.
That is, multiplying the coordinate vector of v in the basis {ei} by the matrix [F]e^f, we
obtain the coordinate vector of F(v) in the basis {fi}.
Theorem 7.9: The mapping F ↦ [F]e^f is an isomorphism from Hom(V, U) onto the vector
space of n × m matrices over K. That is, the mapping is one-one and onto
and, for any F, G ∈ Hom(V, U) and any k ∈ K,
    [F + G]e^f = [F]e^f + [G]e^f and [kF]e^f = k[F]e^f
Remark: Recall that any n × m matrix A over K has been identified with the linear map-
ping from K^m into K^n given by v ↦ Av. Now suppose V and U are vector
spaces over K of dimensions m and n respectively, and suppose {ei} is a basis
of V and {fi} is a basis of U. Then in view of the preceding theorem, we shall
also identify A with the linear mapping F : V → U given by [F(v)]f = A[v]e. We
comment that if other bases of V and U are given, then A is identified with
another linear mapping from V into U.
Theorem 7.10: Let {ei}, {fi} and {gi} be bases of V, U and W respectively. Let F : V → U
and G : U → W be linear mappings. Then
    [G ∘ F]e^g = [G]f^g [F]e^f
That is, relative to the appropriate bases, the matrix representation of the composition
of two linear mappings is equal to the product of the matrix representations of the
individual mappings.
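For maps between spaces of different dimensions the same product rule holds, with rectangular matrices. A sketch in the usual bases, where the matrices are the maps themselves (both sample matrices are arbitrary):

```python
import numpy as np

# F : R^3 -> R^2 and G : R^2 -> R^4 in the usual bases (sample matrices).
F = np.array([[1.0, 0.0,  2.0],
              [0.0, 1.0, -1.0]])          # 2 x 3
G = np.array([[1.0,  1.0],
              [0.0,  2.0],
              [3.0,  0.0],
              [1.0, -1.0]])               # 4 x 2

v = np.array([1.0, 2.0, 3.0])

# [G o F] = [G][F]: applying the product matrix agrees with composing the maps.
assert np.allclose((G @ F) @ v, G @ (F @ v))
print(G @ F)                              # the 4 x 3 matrix of the composition
```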
We lastly show how the matrix representation of a linear mapping F : V → U is affected
when new bases are selected.
Theorem 7.11: Let P be the transition matrix from a basis {ei} to a basis {ei′} in V, and let
Q be the transition matrix from a basis {fi} to a basis {fi′} in U. Then for
any linear mapping F : V → U,
    [F]e′^f′ = Q^-1 [F]e^f P
Thus in particular,
    [F]e^f′ = Q^-1 [F]e^f    i.e. when the change of basis only takes place in U; and
    [F]e′^f = [F]e^f P       i.e. when the change of basis only takes place in V.
Note that Theorems 7.1, 7.2, 7.3 and 7.5 are special cases of Theorems 7.8, 7.9, 7.10
and 7.11 respectively.
The next theorem shows that every linear mapping from one space into another can be
represented by a very simple matrix.
Theorem 7.12: Let F : V → U be linear and, say, rank F = r. Then there exist bases of
V and of U such that the matrix representation of F has the form
    A = ( I  0 )
        ( 0  0 )
where I is the r-square identity matrix. We call A the normal or canonical
form of F.
WARNING
As noted previously, some texts write the operator symbol T to the right of the vector
v on which it acts, that is,
    vT instead of T(v)
In such texts, vectors and operators are represented by n-tuples and matrices which are the
transposes of those appearing here. That is, if
    v = k1 e1 + k2 e2 + ... + kn en
then they write
    [v]e = (k1, k2, ..., kn) instead of [v]e = (k1, k2, ..., kn)^t
And if
    T(e1) = a1 e1 + a2 e2 + ... + an en
    T(e2) = b1 e1 + b2 e2 + ... + bn en
    ..................................
    T(en) = c1 e1 + c2 e2 + ... + cn en
then they write
           ( a1  a2  ...  an )                       ( a1  b1  ...  c1 )
    [T]e = ( b1  b2  ...  bn )   instead of   [T]e = ( a2  b2  ...  c2 )
           ( ................ )                      ( ................ )
           ( c1  c2  ...  cn )                       ( an  bn  ...  cn )
This is also true for the transition matrix from one basis to another and for matrix rep-
resentations of linear mappings F : V → U. We comment that such texts have theorems
which are analogous to the ones appearing here.
Solved Problems
MATRIX REPRESENTATIONS OF LINEAR OPERATORS
7.1. Find the matrix representation of each of the following operators T on R^2 relative to
the usual basis {e1 = (1, 0), e2 = (0, 1)}:
(i) T(x, y) = (2y, 3x - y), (ii) T(x, y) = (3x - 4y, x + 5y).
Note first that if (a, b) ∈ R^2, then (a, b) = a e1 + b e2.
(i) T(e1) = T(1, 0) = (0, 3) = 0e1 + 3e2
    T(e2) = T(0, 1) = (2, -1) = 2e1 - e2
    and [T]e = ( 0   2 )
               ( 3  -1 )
(ii) T(e1) = T(1, 0) = (3, 1) = 3e1 + e2
     T(e2) = T(0, 1) = (-4, 5) = -4e1 + 5e2
     and [T]e = ( 3  -4 )
                ( 1   5 )
7.2. Find the matrix representation of each operator T in the preceding problem relative
to the basis {f1 = (1, 3), f2 = (2, 5)}.
We must first find the coordinates of an arbitrary vector (a, b) ∈ R^2 with respect to the basis
{fi}. We have
    (a, b) = x(1, 3) + y(2, 5) = (x + 2y, 3x + 5y)
or x + 2y = a and 3x + 5y = b
or x = 2b - 5a and y = 3a - b
Thus (a, b) = (2b - 5a)f1 + (3a - b)f2
(i) We have T(x, y) = (2y, 3x - y). Hence
    T(f1) = T(1, 3) = (6, 0) = -30f1 + 18f2
    T(f2) = T(2, 5) = (10, 1) = -48f1 + 29f2
    and [T]f = ( -30  -48 )
               (  18   29 )
(ii) We have T(x, y) = (3x - 4y, x + 5y). Hence
    T(f1) = T(1, 3) = (-9, 16) = 77f1 - 43f2
    T(f2) = T(2, 5) = (-14, 27) = 124f1 - 69f2
    and [T]f = (  77  124 )
               ( -43  -69 )
7.3. Suppose that T is the linear operator on R^3 defined by
    T(x, y, z) = (a1 x + a2 y + a3 z, b1 x + b2 y + b3 z, c1 x + c2 y + c3 z)
Show that the matrix of T in the usual basis {ei} is given by
    [T]e = ( a1  a2  a3 )
           ( b1  b2  b3 )
           ( c1  c2  c3 )
That is, the rows of [T]e are obtained from the coefficients of x, y and z in the com-
ponents of T(x, y, z).
    T(e1) = T(1, 0, 0) = (a1, b1, c1) = a1 e1 + b1 e2 + c1 e3
    T(e2) = T(0, 1, 0) = (a2, b2, c2) = a2 e1 + b2 e2 + c2 e3
    T(e3) = T(0, 0, 1) = (a3, b3, c3) = a3 e1 + b3 e2 + c3 e3
Accordingly,
    [T]e = ( a1  a2  a3 )
           ( b1  b2  b3 )
           ( c1  c2  c3 )
Remark: This property holds for any space K^n but only relative to the usual basis
    {e1 = (1, 0, ..., 0), e2 = (0, 1, 0, ..., 0), ..., en = (0, ..., 0, 1)}
7.4. Find the matrix representation of each of the following linear operators T on R^3
relative to the usual basis {e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1)}:
(i) T(x, y, z) = (2x - 3y + 4z, 5x - y + 2z, 4x + 7y),
(ii) T(x, y, z) = (2y + z, x - 4y, 3x).
By Problem 7.3:
    (i) [T]e = ( 2  -3  4 )      (ii) [T]e = ( 0   2  1 )
               ( 5  -1  2 )                  ( 1  -4  0 )
               ( 4   7  0 )                  ( 3   0  0 )
7.5. Let T be the linear operator on R^3 defined by T(x, y, z) = (2y + z, x - 4y, 3x).
(i) Find the matrix of T in the basis {f1 = (1, 1, 1), f2 = (1, 1, 0), f3 = (1, 0, 0)}.
(ii) Verify that [T]f [v]f = [T(v)]f for any vector v ∈ R^3.
We must first find the coordinates of an arbitrary vector (a, b, c) ∈ R^3 with respect to the basis
{f1, f2, f3}. Write (a, b, c) as a linear combination of the fi using unknown scalars x, y and z:
    (a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z, x + y, x)
Set corresponding components equal to each other to obtain the system of equations
    x + y + z = a,  x + y = b,  x = c
Solve the system for x, y and z in terms of a, b and c to find x = c, y = b - c, z = a - b. Thus
    (a, b, c) = c f1 + (b - c) f2 + (a - b) f3
(i) Since T(x, y, z) = (2y + z, x - 4y, 3x),
    T(f1) = T(1, 1, 1) = (3, -3, 3) = 3f1 - 6f2 + 6f3
    T(f2) = T(1, 1, 0) = (2, -3, 3) = 3f1 - 6f2 + 5f3          ( 3   3   3 )
    T(f3) = T(1, 0, 0) = (0, 1, 3) = 3f1 - 2f2 - f3   and [T]f = ( -6  -6  -2 )
                                                                 (  6   5  -1 )
(ii) Suppose v = (a, b, c); then
    v = c f1 + (b - c) f2 + (a - b) f3 and so [v]f = (c, b - c, a - b)^t
Also,
    T(v) = T(a, b, c) = (2b + c, a - 4b, 3a)
         = 3a f1 + (-2a - 4b) f2 + (-a + 6b + c) f3
and so [T(v)]f = (3a, -2a - 4b, -a + 6b + c)^t. Thus
    [T]f [v]f = ( 3   3   3 ) ( c     )   ( 3a          )
                ( -6  -6  -2 ) ( b - c ) = ( -2a - 4b    ) = [T(v)]f
                (  6   5  -1 ) ( a - b )   ( -a + 6b + c )
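The coordinate bookkeeping of Problem 7.5 is a natural candidate for a machine check. The sketch below restates [T]e from Problem 7.4(ii) and the basis f1, f2, f3, and computes [T]f by one linear solve:

```python
import numpy as np

T = np.array([[0.0,  2.0, 1.0],
              [1.0, -4.0, 0.0],
              [3.0,  0.0, 0.0]])  # T(x,y,z) = (2y+z, x-4y, 3x) in the usual basis
F = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0]])   # columns f1, f2, f3

Tf = np.linalg.solve(F, T @ F)    # matrix of T in the basis {f1, f2, f3}
print(Tf)

# Spot-check part (ii) for one vector: mapping coordinates with Tf and
# converting back agrees with applying T directly.
v = np.array([2.0, 3.0, 5.0])
assert np.allclose(F @ (Tf @ np.linalg.solve(F, v)), T @ v)
```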
7.6. Let A = ( 1  2 ) and let T be the linear operator on R^2 defined by T(v) = Av (where
             ( 3  4 )
v is written as a column vector). Find the matrix of T in each of the following bases:
(i) {e1 = (1, 0), e2 = (0, 1)}, i.e. the usual basis; (ii) {f1 = (1, 3), f2 = (2, 5)}.
(i) T(e1) = ( 1  2 ) ( 1 ) = ( 1 ) = e1 + 3e2
            ( 3  4 ) ( 0 )   ( 3 )
    T(e2) = ( 1  2 ) ( 0 ) = ( 2 ) = 2e1 + 4e2
            ( 3  4 ) ( 1 )   ( 4 )
    and [T]e = ( 1  2 ) = A
               ( 3  4 )
Observe that the matrix of T in the usual basis is precisely the original matrix A which
defined T. This is not unusual. In fact, we show in the next problem that this is true for any
matrix A when using the usual basis.
(ii) By Problem 7.2, (a, b) = (2b - 5a)f1 + (3a - b)f2. Hence
    T(f1) = ( 1  2 ) ( 1 ) = (  7 ) = -5f1 + 6f2
            ( 3  4 ) ( 3 )   ( 15 )
    T(f2) = ( 1  2 ) ( 2 ) = ( 12 ) = -8f1 + 10f2
            ( 3  4 ) ( 5 )   ( 26 )
    and [T]f = ( -5  -8 )
               (  6  10 )
7.7. Recall that any n-square matrix A = (aij) may be viewed as the linear operator T on
K^n defined by T(v) = Av, where v is written as a column vector. Show that the
matrix representation of T relative to the usual basis {ei} of K^n is the matrix A, that
is, [T]e = A.
    T(e1) = Ae1 = (a11, a21, ..., an1)^t = a11 e1 + a21 e2 + ... + an1 en
    T(e2) = Ae2 = (a12, a22, ..., an2)^t = a12 e1 + a22 e2 + ... + an2 en
    .............................................................
    T(en) = Aen = (a1n, a2n, ..., ann)^t = a1n e1 + a2n e2 + ... + ann en
(That is, T(ei) = Aei is the ith column of A.) Accordingly,
           ( a11  a12  ...  a1n )
    [T]e = ( a21  a22  ...  a2n )  = A
           ( .................. )
           ( an1  an2  ...  ann )
7.8. Each of the sets (i) {1, t, e^t, te^t} and (ii) {e^(3t), te^(3t), t^2 e^(3t)} is a basis of a vector space V
of functions f : R → R. Let D be the differential operator on V, that is, D(f) = df/dt.
Find the matrix of D in the given basis.
(i) D(1)     = 0          = 0(1) + 0(t) + 0(e^t) + 0(te^t)
    D(t)     = 1          = 1(1) + 0(t) + 0(e^t) + 0(te^t)
    D(e^t)   = e^t        = 0(1) + 0(t) + 1(e^t) + 0(te^t)
    D(te^t)  = e^t + te^t = 0(1) + 0(t) + 1(e^t) + 1(te^t)
    and
    [D] = ( 0  1  0  0 )
          ( 0  0  0  0 )
          ( 0  0  1  1 )
          ( 0  0  0  1 )
(ii) D(e^(3t))      = 3e^(3t)               = 3(e^(3t)) + 0(te^(3t)) + 0(t^2 e^(3t))
     D(te^(3t))     = e^(3t) + 3te^(3t)     = 1(e^(3t)) + 3(te^(3t)) + 0(t^2 e^(3t))
     D(t^2 e^(3t))  = 2te^(3t) + 3t^2 e^(3t) = 0(e^(3t)) + 2(te^(3t)) + 3(t^2 e^(3t))
     and
     [D] = ( 3  1  0 )
           ( 0  3  2 )
           ( 0  0  3 )
7.9. Prove Theorem 7.1: Suppose {e1, ..., en} is a basis of V and T is a linear operator
on V. Then for any v ∈ V, [T]e [v]e = [T(v)]e.
Suppose, for i = 1, ..., n,
    T(ei) = ai1 e1 + ai2 e2 + ... + ain en = Σj aij ej
Then [T]e is the n-square matrix whose jth row is
    (a1j, a2j, ..., anj)    (1)
Now suppose
    v = k1 e1 + k2 e2 + ... + kn en = Σi ki ei
Writing a column vector as the transpose of a row vector,
    [v]e = (k1, k2, ..., kn)^t    (2)
Furthermore, using the linearity of T,
    T(v) = T(Σi ki ei) = Σi ki T(ei) = Σi ki (Σj aij ej)
         = Σj (Σi aij ki) ej = Σj (a1j k1 + a2j k2 + ... + anj kn) ej
Thus [T(v)]e is the column vector whose jth entry is
    a1j k1 + a2j k2 + ... + anj kn    (3)
On the other hand, the jth entry of [T]e [v]e is obtained by multiplying the jth row of [T]e by [v]e,
i.e. (1) by (2). But the product of (1) and (2) is (3); hence [T]e [v]e and [T(v)]e have the same entries.
Thus [T]e [v]e = [T(v)]e.
7.10. Prove Theorem 7.2: Let {e1, ..., en} be a basis of V over K, and let 𝒜 be the algebra
of n-square matrices over K. Then the mapping T ↦ [T]e is a vector space isomor-
phism from A(V) onto 𝒜. That is, the mapping is one-one and onto and, for any
S, T ∈ A(V) and any k ∈ K, [T + S]e = [T]e + [S]e and [kT]e = k[T]e.
The mapping is one-one since, by Theorem 8.1, a linear mapping is completely determined by
its values on a basis. The mapping is onto since each matrix M ∈ 𝒜 is the image of the linear
operator
    F(ei) = Σj mij ej,   i = 1, ..., n
where (mij) is the transpose of the matrix M.
Now suppose, for i = 1, ..., n,
    T(ei) = Σj aij ej and S(ei) = Σj bij ej
Let A and B be the matrices A = (aij) and B = (bij). Then [T]e = A^t and [S]e = B^t. We have,
for i = 1, ..., n,
    (T + S)(ei) = T(ei) + S(ei) = Σj (aij + bij) ej
Observe that A + B is the matrix (aij + bij). Accordingly,
    [T + S]e = (A + B)^t = A^t + B^t = [T]e + [S]e
We also have, for i = 1, ..., n,
    (kT)(ei) = k T(ei) = k Σj aij ej = Σj (k aij) ej
Observe that kA is the matrix (k aij). Accordingly,
    [kT]e = (kA)^t = kA^t = k[T]e
Thus the theorem is proved.
7.11. Prove Theorem 7.3: Let {e1, ..., en} be a basis of V. Then for any linear operators
S, T ∈ A(V), [ST]e = [S]e [T]e.
Suppose T(ei) = Σj aij ej and S(ej) = Σk bjk ek. Let A and B be the matrices A = (aij) and
B = (bjk). Then [T]e = A^t and [S]e = B^t. We have
    (ST)(ei) = S(T(ei)) = S(Σj aij ej) = Σj aij S(ej)
             = Σj aij (Σk bjk ek) = Σk (Σj aij bjk) ek
Recall that AB is the matrix AB = (cik) where cik = Σj aij bjk. Accordingly,
    [ST]e = (AB)^t = B^t A^t = [S]e [T]e
CHANGE OF BASIS, SIMILAR MATRICES
7.12. Consider these bases of R^2: {e1 = (1, 0), e2 = (0, 1)} and {f1 = (1, 3), f2 = (2, 5)}.
(i) Find the transition matrix P from {ei} to {fi}. (ii) Find the transition matrix Q
from {fi} to {ei}. (iii) Verify that Q = P^-1. (iv) Show that [v]f = P^-1 [v]e for any
vector v ∈ R^2. (v) Show that [T]f = P^-1 [T]e P for the operator T on R^2 defined by
T(x, y) = (2y, 3x - y). (See Problems 7.1 and 7.2.)
(i) f1 = (1, 3) = e1 + 3e2       and P = ( 1  2 )
    f2 = (2, 5) = 2e1 + 5e2              ( 3  5 )
(ii) By Problem 7.2, (a, b) = (2b - 5a)f1 + (3a - b)f2. Thus
    e1 = (1, 0) = -5f1 + 3f2     and Q = ( -5  2 )
    e2 = (0, 1) = 2f1 - f2               (  3 -1 )
(iii) PQ = ( 1  2 ) ( -5  2 ) = ( 1  0 ) = I
           ( 3  5 ) (  3 -1 )   ( 0  1 )
(iv) If v = (a, b), then [v]e = (a, b)^t and [v]f = (2b - 5a, 3a - b)^t. Hence
    P^-1 [v]e = ( -5  2 ) ( a ) = ( -5a + 2b ) = [v]f
                (  3 -1 ) ( b )   ( 3a - b   )
(v) By Problems 7.1 and 7.2, [T]e = ( 0   2 ) and [T]f = ( -30  -48 ). Thus
                                    ( 3  -1 )            (  18   29 )
    P^-1 [T]e P = ( -5  2 ) ( 0   2 ) ( 1  2 ) = ( -30  -48 ) = [T]f
                  (  3 -1 ) ( 3  -1 ) ( 3  5 )   (  18   29 )
7.13. Consider the following bases of R^3: {e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1)} and
{f1 = (1, 1, 1), f2 = (1, 1, 0), f3 = (1, 0, 0)}. (i) Find the transition matrix P from {ei}
to {fi}. (ii) Find the transition matrix Q from {fi} to {ei}. (iii) Verify that Q = P^-1.
(iv) Show that [v]f = P^-1 [v]e for any vector v ∈ R^3. (v) Show that [T]f = P^-1 [T]e P
for the T defined by T(x, y, z) = (2y + z, x - 4y, 3x). (See Problems 7.4 and 7.5.)
(i) f1 = (1, 1, 1) = e1 + e2 + e3
    f2 = (1, 1, 0) = e1 + e2 + 0e3       and P = ( 1  1  1 )
    f3 = (1, 0, 0) = e1 + 0e2 + 0e3              ( 1  1  0 )
                                                 ( 1  0  0 )
(ii) By Problem 7.5, (a, b, c) = c f1 + (b - c) f2 + (a - b) f3. Thus
    e1 = (1, 0, 0) = 0f1 + 0f2 + 1f3
    e2 = (0, 1, 0) = 0f1 + 1f2 - 1f3     and Q = ( 0  0  1 )
    e3 = (0, 0, 1) = 1f1 - 1f2 + 0f3             ( 0  1 -1 )
                                                 ( 1 -1  0 )
(iii) PQ = ( 1  1  1 ) ( 0  0  1 )   ( 1  0  0 )
           ( 1  1  0 ) ( 0  1 -1 ) = ( 0  1  0 ) = I
           ( 1  0  0 ) ( 1 -1  0 )   ( 0  0  1 )
(iv) If v = (a, b, c), then [v]e = (a, b, c)^t and [v]f = (c, b - c, a - b)^t. Thus
    P^-1 [v]e = ( 0  0  1 ) ( a )   ( c     )
                ( 0  1 -1 ) ( b ) = ( b - c ) = [v]f
                ( 1 -1  0 ) ( c )   ( a - b )
(v) By Problems 7.4(ii) and 7.5, [T]e = ( 0   2  1 ) and [T]f = ( 3   3   3 ). Thus
                                        ( 1  -4  0 )            ( -6  -6  -2 )
                                        ( 3   0  0 )            (  6   5  -1 )
    P^-1 [T]e P = ( 0  0  1 ) ( 0   2  1 ) ( 1  1  1 )   ( 3   3   3 )
                  ( 0  1 -1 ) ( 1  -4  0 ) ( 1  1  0 ) = ( -6  -6  -2 ) = [T]f
                  ( 1 -1  0 ) ( 3   0  0 ) ( 1  0  0 )   (  6   5  -1 )
7.14. Prove Theorem 7.4: Let P be the transition matrix from a basis {ei} to a basis {fi}
in a vector space V. Then for any v ∈ V, P[v]f = [v]e. Also, [v]f = P^-1 [v]e.
Suppose, for i = 1, ..., n,
    fi = ai1 e1 + ai2 e2 + ... + ain en = Σj aij ej
Then P is the n-square matrix whose jth row is
    (a1j, a2j, ..., anj)    (1)
Also suppose
    v = k1 f1 + k2 f2 + ... + kn fn = Σi ki fi
Then, writing a column vector as the transpose of a row vector,
    [v]f = (k1, k2, ..., kn)^t    (2)
Substituting for fi in the equation for v,
    v = Σi ki fi = Σi ki (Σj aij ej) = Σj (Σi aij ki) ej
      = Σj (a1j k1 + a2j k2 + ... + anj kn) ej
Accordingly, [v]e is the column vector whose jth entry is
    a1j k1 + a2j k2 + ... + anj kn    (3)
On the other hand, the jth entry of P[v]f is obtained by multiplying the jth row of P by [v]f, i.e.
(1) by (2). But the product of (1) and (2) is (3); hence P[v]f and [v]e have the same entries and thus
P[v]f = [v]e.
Furthermore, multiplying the above by P^-1 gives P^-1 [v]e = P^-1 P [v]f = [v]f.
7.15. Prove Theorem 7.5: Let P be the transition matrix from a basis {ei} to a basis {fi} in
a vector space V. Then, for any linear operator T on V, [T]f = P^-1 [T]e P.
For any vector v ∈ V, P^-1 [T]e P [v]f = P^-1 [T]e [v]e = P^-1 [T(v)]e = [T(v)]f.
But [T]f [v]f = [T(v)]f; hence P^-1 [T]e P [v]f = [T]f [v]f.
Since the mapping v ↦ [v]f is onto K^n, P^-1 [T]e P X = [T]f X for every X ∈ K^n.
Accordingly, P^-1 [T]e P = [T]f.
7.16. Show that similarity of matrices is an equivalence relation, that is: (i) A is similar
to A; (ii) if A is similar to B, then B is similar to A; (iii) if A is similar to B and B is
similar to C, then A is similar to C.
(i) The identity matrix I is invertible and I = I^-1. Since A = I^-1 AI, A is similar to A.
(ii) Since A is similar to B there exists an invertible matrix P such that A = P^-1 BP. Hence
B = PAP^-1 = (P^-1)^-1 A P^-1 and P^-1 is invertible. Thus B is similar to A.
(iii) Since A is similar to B there exists an invertible matrix P such that A = P^-1 BP, and since
B is similar to C there exists an invertible matrix Q such that B = Q^-1 CQ. Hence A =
P^-1 BP = P^-1 (Q^-1 CQ) P = (QP)^-1 C (QP) and QP is invertible. Thus A is similar to C.
TRACE
7.17. The trace of a square matrix A = (aij), written tr(A), is the sum of its diagonal
elements: tr(A) = a11 + a22 + ... + ann. Show that (i) tr(AB) = tr(BA), (ii) if A
is similar to B then tr(A) = tr(B).
(i) Suppose A = (aij) and B = (bij). Then AB = (cik) where cik = Σj aij bjk. Thus
    tr(AB) = Σi cii = Σi Σj aij bji
On the other hand, BA = (djk) where djk = Σi bji aik. Thus
    tr(BA) = Σj djj = Σj Σi bji aij = Σi Σj aij bji = tr(AB)
(ii) If A is similar to B, there exists an invertible matrix P such that A = P^-1 BP. Using (i),
    tr(A) = tr(P^-1 BP) = tr(BPP^-1) = tr(B)
7.18. Find the trace of the following operator on R^3:
    T(x, y, z) = (a1 x + a2 y + a3 z, b1 x + b2 y + b3 z, c1 x + c2 y + c3 z)
We first must find a matrix representation of T. Choosing the usual basis {ei},
    [T]e = ( a1  a2  a3 )
           ( b1  b2  b3 )
           ( c1  c2  c3 )
and tr(T) = tr([T]e) = a1 + b2 + c3.
7.19. Let V be the space of 2 × 2 matrices over R, and let M = ( 1  2 ). Let T be the linear
operator on V defined by T(A) = MA. Find the trace of T.    ( 3  4 )
We must first find a matrix representation of T. Choose the usual basis of V:
    E1 = ( 1  0 ),  E2 = ( 0  1 ),  E3 = ( 0  0 ),  E4 = ( 0  0 )
         ( 0  0 )        ( 0  0 )        ( 1  0 )        ( 0  1 )
Then
    T(E1) = ME1 = ( 1  2 ) ( 1  0 ) = ( 1  0 ) = 1E1 + 0E2 + 3E3 + 0E4
                  ( 3  4 ) ( 0  0 )   ( 3  0 )
    T(E2) = ME2 = ( 1  2 ) ( 0  1 ) = ( 0  1 ) = 0E1 + 1E2 + 0E3 + 3E4
                  ( 3  4 ) ( 0  0 )   ( 0  3 )
    T(E3) = ME3 = ( 1  2 ) ( 0  0 ) = ( 2  0 ) = 2E1 + 0E2 + 4E3 + 0E4
                  ( 3  4 ) ( 1  0 )   ( 4  0 )
    T(E4) = ME4 = ( 1  2 ) ( 0  0 ) = ( 0  2 ) = 0E1 + 2E2 + 0E3 + 4E4
                  ( 3  4 ) ( 0  1 )   ( 0  4 )
Hence
    [T]e = ( 1  0  2  0 )
           ( 0  1  0  2 )
           ( 3  0  4  0 )
           ( 0  3  0  4 )
and tr(T) = 1 + 1 + 4 + 4 = 10.
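Problem 7.19 can be reproduced by flattening each 2 × 2 matrix into a coordinate vector (row-major flattening matches the ordering E1, E2, E3, E4). A numpy sketch:

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# The usual basis of the space of 2x2 matrices.
E = [np.array([[1.0, 0.0], [0.0, 0.0]]),
     np.array([[0.0, 1.0], [0.0, 0.0]]),
     np.array([[0.0, 0.0], [1.0, 0.0]]),
     np.array([[0.0, 0.0], [0.0, 1.0]])]

# Column i of [T]_e holds the coordinates of T(E_i) = M @ E_i; row-major
# flattening gives exactly those coordinates in the basis E1..E4.
Te = np.column_stack([(M @ Ei).flatten() for Ei in E])
print(np.trace(Te))               # trace of the operator T
```

Note that the result equals 2 tr(M) = 10; for the operator A ↦ MA on n × n matrices, each diagonal entry of M contributes n times.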
MATRIX REPRESENTATIONS OF LINEAR MAPPINGS
7.20. Let F : R^3 → R^2 be the linear mapping defined by F(x, y, z) = (3x + 2y - 4z, x - 5y + 3z).
(i) Find the matrix of F in the following bases of R^3 and R^2:
    {f1 = (1, 1, 1), f2 = (1, 1, 0), f3 = (1, 0, 0)},  {g1 = (1, 3), g2 = (2, 5)}
(ii) Verify that the action of F is preserved by its matrix representation; that is, for
any v ∈ R^3, [F]f^g [v]f = [F(v)]g.
(i) By Problem 7.2, (a, b) = (2b - 5a)g1 + (3a - b)g2. Hence
    F(f1) = F(1, 1, 1) = (1, -1) = -7g1 + 4g2
    F(f2) = F(1, 1, 0) = (5, -4) = -33g1 + 19g2       and [F]f^g = ( -7  -33  -13 )
    F(f3) = F(1, 0, 0) = (3, 1) = -13g1 + 8g2                      (  4   19    8 )
(ii) If v = (x, y, z) then, by Problem 7.5, v = z f1 + (y - z) f2 + (x - y) f3. Also,
    F(v) = (3x + 2y - 4z, x - 5y + 3z) = (-13x - 20y + 26z)g1 + (8x + 11y - 15z)g2,
and so
    [v]f = (z, y - z, x - y)^t and [F(v)]g = ( -13x - 20y + 26z )
                                             (  8x + 11y - 15z  )
Thus
    [F]f^g [v]f = ( -7  -33  -13 ) ( z     )   ( -13x - 20y + 26z )
                  (  4   19    8 ) ( y - z ) = (  8x + 11y - 15z  ) = [F(v)]g
                                   ( x - y )
7.21. Let F : K^n → K^m be the linear mapping defined by
    F(x1, x2, ..., xn) = (a11 x1 + ... + a1n xn, a21 x1 + ... + a2n xn, ..., am1 x1 + ... + amn xn)
Show that the matrix representation of F relative to the usual bases of K^n and of K^m
is given by
    [F] = ( a11  a12  ...  a1n )
          ( a21  a22  ...  a2n )
          ( .................. )
          ( am1  am2  ...  amn )
That is, the rows of [F] are obtained from the coefficients of the xi in the components
of F(x1, ..., xn), respectively.
    F(1, 0, ..., 0) = (a11, a21, ..., am1)
    F(0, 1, ..., 0) = (a12, a22, ..., am2)         and so [F] = ( a11  a12  ...  a1n )
    ..................................                          ( .................. )
    F(0, 0, ..., 1) = (a1n, a2n, ..., amn)                      ( am1  am2  ...  amn )
7.22. Find the matrix representation of each of the following linear mappings relative to
the usual bases of R^n:
(i) F : R^2 → R^3 defined by F(x, y) = (3x - y, 2x + 4y, 5x - 6y)
(ii) F : R^4 → R^2 defined by F(x, y, s, t) = (3x - 4y + 2s - t, 5x + 7y - s - 2t)
(iii) F : R^3 → R^4 defined by F(x, y, z) = (2x + 3y - 8z, x + y + z, 4x - 5z, 6y)
By Problem 7.21, we need only look at the coefficients of the unknowns in F(x, y, ...). Thus
    (i) [F] = ( 3  -1 )    (ii) [F] = ( 3  -4   2  -1 )    (iii) [F] = ( 2  3  -8 )
              ( 2   4 )               ( 5   7  -1  -2 )                ( 1  1   1 )
              ( 5  -6 )                                                ( 4  0  -5 )
                                                                       ( 0  6   0 )
7.23. Let T : R^2 → R^2 be defined by T(x, y) = (2x - 3y, x + 4y). Find the matrix of T in
the bases {e1 = (1, 0), e2 = (0, 1)} and {f1 = (1, 3), f2 = (2, 5)} of R^2 respectively.
(We can view T as a linear mapping from one space into another, each having its
own basis.)
By Problem 7.2, (a, b) = (2b - 5a)f1 + (3a - b)f2. Then
    T(e1) = T(1, 0) = (2, 1) = -8f1 + 5f2       and [T]e^f = ( -8   23 )
    T(e2) = T(0, 1) = (-3, 4) = 23f1 - 13f2                  (  5  -13 )
7.24. Let A = ( 2  5 -3 ). Recall that A determines a linear mapping F : R^3 -> R^2 de-
              ( 1 -4  7 )
fined by F(v) = Av where v is written as a column vector.
(i)  Show that the matrix representation of F relative to the usual bases of R^3 and
     of R^2 is the matrix A itself: [F] = A.
(ii) Find the matrix representation of F relative to the following bases of R^3 and R^2:
     {f1 = (1, 1, 1), f2 = (1, 1, 0), f3 = (1, 0, 0)},    {g1 = (1, 3), g2 = (2, 5)}

(i)  F(1, 0, 0) = (2, 1)  = 2e1 + e2
     F(0, 1, 0) = (5, -4) = 5e1 - 4e2
     F(0, 0, 1) = (-3, 7) = -3e1 + 7e2

     from which  [F] = ( 2  5 -3 ) = A.    (Compare with Problem 7.7.)
                       ( 1 -4  7 )

(ii) By Problem 7.2, (a, b) = (2b - 5a)g1 + (3a - b)g2. Then

     F(f1) = A(1, 1, 1)^t = (4, 4)  = -12g1 + 8g2
     F(f2) = A(1, 1, 0)^t = (7, -3) = -41g1 + 24g2     and so   [F] = ( -12  -41  -8 )
     F(f3) = A(1, 0, 0)^t = (2, 1)  = -8g1 + 5g2                      (   8   24   5 )
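The coordinate computations of Problems 7.23 and 7.24 can be sketched numerically (a hypothetical helper of our own, not from the book): the g-coordinates of each F(f_j) are found by solving a linear system rather than by the closed formula of Problem 7.2.

```python
import numpy as np

# jth column of [F] relative to bases {f_j} (domain) and {g_i} (codomain):
# the g-coordinates c of F(f_j), obtained by solving G c = F(f_j),
# where the columns of G are the g_i.
def matrix_in_bases(A, f_basis, g_basis):
    G = np.array(g_basis, dtype=float).T        # columns are the g_i
    cols = [np.linalg.solve(G, A @ np.array(f)) for f in f_basis]
    return np.column_stack(cols)

A = np.array([[2, 5, -3], [1, -4, 7]])          # the map F(v) = Av of Problem 7.24
f = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
g = [(1, 3), (2, 5)]
print(matrix_in_bases(A, f, g))
# columns: (-12, 8), (-41, 24), (-8, 5) -- the matrix of part (ii)
```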
7.25. Prove Theorem 7.12: Let F : V -> U be linear. Then there exists a basis of V and a
basis of U such that the matrix representation A of F has the form

    A = ( I  0 )
        ( 0  0 )

where I is the r-square identity matrix and r is the rank of F.

Suppose dim V = m and dim U = n. Let W be the kernel of F and U' the image of F. We
are given that rank F = r; hence the dimension of the kernel of F is m - r. Let {w1, ..., w_{m-r}}
be a basis of the kernel of F and extend this to a basis of V:

    {v1, ..., vr, w1, ..., w_{m-r}}

Set u1 = F(v1), u2 = F(v2), ..., ur = F(vr).
We note that {u1, ..., ur} is a basis of U', the image of F. Extend this to a basis

    {u1, ..., ur, u_{r+1}, ..., un}

of U. Observe that

    F(v1) = u1 = 1u1 + 0u2 + ... + 0ur + 0u_{r+1} + ... + 0un
    F(v2) = u2 = 0u1 + 1u2 + ... + 0ur + 0u_{r+1} + ... + 0un
    .........................................................
    F(vr) = ur = 0u1 + 0u2 + ... + 1ur + 0u_{r+1} + ... + 0un
    F(w1) = 0  = 0u1 + 0u2 + ... + 0ur + 0u_{r+1} + ... + 0un
    .........................................................
    F(w_{m-r}) = 0 = 0u1 + 0u2 + ... + 0ur + 0u_{r+1} + ... + 0un

Thus the matrix of F in the above bases has the required form.
Supplementary Problems
MATRIX REPRESENTATIONS OF LINEAR OPERATORS
7.26. Find the matrix of each of the following linear operators T on R^2 with respect to the usual basis
{e1 = (1, 0), e2 = (0, 1)}:  (i) T(x, y) = (2x - 3y, x + y),  (ii) T(x, y) = (5x + y, 3x - 2y).

7.27. Find the matrix of each operator T in the preceding problem with respect to the basis {f1 = (1, 2),
f2 = (2, 3)}. In each case, verify that [T]_f [v]_f = [T(v)]_f for any v in R^2.

7.28. Find the matrix of each operator T in Problem 7.26 in the basis {g1 = (1, 3), g2 = (1, 4)}.
7.29. Find the matrix representation of each of the following linear operators T on R^3 relative to the
usual basis:
(i)   T(x, y, z) = (x, y, 0)
(ii)  T(x, y, z) = (2x - 7y - 4z, 3x + y + 4z, 6x - 8y + z)
(iii) T(x, y, z) = (z, y + z, x + y + z)

7.30. Let D be the differential operator, i.e. D(f) = df/dt. Each of the following sets is a basis of a
vector space V of functions f : R -> R. Find the matrix of D in each basis:  (i) {e^t, e^{2t}, t e^{2t}},
(ii) {sin t, cos t},  (iii) {e^{5t}, t e^{5t}, t^2 e^{5t}},  (iv) {1, t, sin 3t, cos 3t}.

7.31. Consider the complex field C as a vector space over the real field R. Let T be the conjugation
operator on C, i.e. T(z) = z̄. Find the matrix of T in each basis:  (i) {1, i},  (ii) {1 + i, 1 + 2i}.

7.32. Let V be the vector space of 2 x 2 matrices over R and let M = ( a  b ). Find the matrix of each
                                                                     ( c  d )
of the following linear operators T on V in the usual basis (see Problem 7.19) of V:  (i) T(A) = MA,
(ii) T(A) = AM,  (iii) T(A) = MA - AM.

7.33. Let 1_V and 0_V denote the identity and zero operators, respectively, on a vector space V. Show that,
for any basis {e_i} of V,  (i) [1_V]_e = I, the identity matrix,  (ii) [0_V]_e = 0, the zero matrix.
CHANGE OF BASIS, SIMILAR MATRICES
7.34. Consider the following bases of R^2:  {e1 = (1, 0), e2 = (0, 1)}  and  {f1 = (1, 2), f2 = (2, 3)}.
(i)   Find the transition matrices P and Q from {e_i} to {f_i} and from {f_i} to {e_i}, respectively.
      Verify Q = P^{-1}.
(ii)  Show that [v]_e = P[v]_f for any vector v in R^2.
(iii) Show that [T]_f = P^{-1}[T]_e P for each operator T in Problem 7.26.

7.35. Repeat Problem 7.34 for the bases {f1 = (1, 2), f2 = (2, 3)} and {g1 = (1, 3), g2 = (1, 4)}.

7.36. Suppose {e1, e2} is a basis of V and T : V -> V is the linear operator for which T(e1) = 3e1 - 2e2
and T(e2) = e1 + 4e2. Suppose {f1, f2} is the basis of V for which f1 = e1 + e2 and f2 = 2e1 + 3e2.
Find the matrix of T in the basis {f1, f2}.

7.37. Consider the bases B = {1, i} and B' = {1 + i, 1 + 2i} of the complex field C over the real field
R.  (i) Find the transition matrices P and Q from B to B' and from B' to B, respectively. Verify
that Q = P^{-1}.  (ii) Show that [T]_{B'} = P^{-1}[T]_B P for the conjugation operator T in Problem 7.31.

7.38. Suppose {e_i}, {f_i} and {g_i} are bases of V, and that P and Q are the transition matrices from {e_i}
to {f_i} and from {f_i} to {g_i}, respectively. Show that PQ is the transition matrix from {e_i} to {g_i}.

7.39. Let A be a 2 by 2 matrix such that only A itself is similar to A. Show that A has the form

    A = ( a  0 )
        ( 0  a )

Generalize to n x n matrices.
7.40. Show that all the matrices similar to an invertible matrix are invertible. More generally, show that
similar matrices have the same rank.
MATRIX REPRESENTATIONS OF LINEAR MAPPINGS
7.41. Find the matrix representation of each of the following linear mappings relative to the usual bases of R^n:
(i)   F : R^3 -> R^2 defined by F(x, y, z) = (2x - 4y + 9z, 5x + 3y - 2z)
(ii)  F : R^2 -> R^4 defined by F(x, y) = (3x + 4y, 5x - 2y, x + 7y, 4x)
(iii) F : R^4 -> R defined by F(x, y, s, t) = 2x + 3y - 7s - t
(iv)  F : R -> R^2 defined by F(x) = (3x, 5x)
7.42. Let F : R^3 -> R^2 be the linear mapping defined by F(x, y, z) = (2x + y - z, 3x - 2y + 4z).
(i)  Find the matrix of F in the following bases of R^3 and R^2:
     {f1 = (1, 1, 1), f2 = (1, 1, 0), f3 = (1, 0, 0)}    and    {g1 = (1, 3), g2 = (1, 4)}
(ii) Verify that, for any vector v in R^3, [F][v]_f = [F(v)]_g.

7.43. Let {e_i} and {f_i} be bases of V, and let 1_V be the identity mapping on V. Show that the matrix of
1_V in the bases {e_i} and {f_i} is the inverse of the transition matrix P from {e_i} to {f_i}; that is, [1_V] = P^{-1}.
7.44. Prove Theorem 7.7, page 155. (Hint. See Problem 7.9, page 161.)
7.45. Prove Theorem 7.8. (Hint. See Problem 7.10.)
7.46. Prove Theorem 7.9. (Hint. See Problem 7.11.)
7.47. Prove Theorem 7.10. (Hint. See Problem 7.15.)
MISCELLANEOUS PROBLEMS
7.48. Let T be a linear operator on V and let W be a subspace of V invariant under T, that is,
T(W) ⊆ W. Suppose dim W = m. Show that T has a matrix representation of the form

    ( A  B )
    ( 0  C )

where A is an m x m submatrix.

7.49. Let V = U ⊕ W, and let U and W each be invariant under a linear operator T : V -> V. Suppose
dim U = m and dim W = n. Show that T has a matrix representation of the form

    ( A  0 )
    ( 0  B )

where A and B are m x m and n x n submatrices, respectively.
7.50. Recall that two linear operators F and G on V are said to be similar if there exists an invertible
operator T on V such that G = T^{-1}FT.
(i)  Show that linear operators F and G are similar if and only if, for any basis {e_i} of V, the
     matrix representations [F]_e and [G]_e are similar matrices.
(ii) Show that if an operator F is diagonalizable, then any similar operator G is also diagonalizable.

7.51. Two m x n matrices A and B over K are said to be equivalent if there exists an m-square invertible
matrix Q and an n-square invertible matrix P such that B = QAP.
(i)   Show that equivalence of matrices is an equivalence relation.
(ii)  Show that A and B can be matrix representations of the same linear mapping F : V -> U if
      and only if A and B are equivalent.
(iii) Show that every matrix A is equivalent to a matrix of the form

          ( I  0 )
          ( 0  0 )

      where I is the r-square identity matrix and r = rank A.

7.52. Two algebras A and B over a field K are said to be isomorphic (as algebras) if there exists a bijective
mapping f : A -> B such that for u, v ∈ A and k ∈ K,  (i) f(u + v) = f(u) + f(v),  (ii) f(ku) = kf(u),
(iii) f(uv) = f(u)f(v). (That is, f preserves the three operations of an algebra: vector addition, scalar
multiplication, and vector multiplication.) The mapping f is then called an isomorphism of A onto
B. Show that the relation of algebra isomorphism is an equivalence relation.

7.53. Let 𝒜 be the algebra of n-square matrices over K, and let P be an invertible matrix in 𝒜. Show
that the map A ↦ P^{-1}AP, where A ∈ 𝒜, is an algebra isomorphism of 𝒜 onto itself.
Answers to Supplementary Problems
7.26. (i) ( 2 -3 )    (ii) ( 5  1 )
          ( 1  1 )         ( 3 -2 )

7.27. Here (a, b) = (2b - 3a)f1 + (2a - b)f2.

      (i) (  18   25 )    (ii) ( -23  -39 )
          ( -11  -15 )         (  15   26 )

7.28. Here (a, b) = (4a - b)g1 + (b - 3a)g2.

      (i) ( -32  -45 )    (ii) (  35   41 )
          (  25   35 )         ( -27  -32 )

7.29. (i) ( 1 0 0 )    (ii) ( 2 -7 -4 )    (iii) ( 0 0 1 )
          ( 0 1 0 )         ( 3  1  4 )          ( 0 1 1 )
          ( 0 0 0 )         ( 6 -8  1 )          ( 1 1 1 )

7.30. (i) ( 1 0 0 )    (ii) ( 0 -1 )    (iii) ( 5 1 0 )    (iv) ( 0 1 0  0 )
          ( 0 2 1 )         ( 1  0 )          ( 0 5 2 )         ( 0 0 0  0 )
          ( 0 0 2 )                           ( 0 0 5 )         ( 0 0 0 -3 )
                                                                ( 0 0 3  0 )

7.31. (i) ( 1  0 )    (ii) (  3   4 )
          ( 0 -1 )         ( -2  -3 )

7.32. (i) ( a 0 b 0 )    (ii) ( a c 0 0 )    (iii) (  0   -c    b    0 )
          ( 0 a 0 b )         ( b d 0 0 )          ( -b  a-d    0    b )
          ( c 0 d 0 )         ( 0 0 a c )          (  c    0  d-a   -c )
          ( 0 c 0 d )         ( 0 0 b d )          (  0    c   -b    0 )

7.34. P = ( 1 2 ),    Q = ( -3  2 )
          ( 2 3 )         (  2 -1 )

7.35. P = (  3  5 ),    Q = (  2  5 )
          ( -1 -2 )         ( -1 -3 )

7.36. (  8  11 )
      ( -2  -1 )

7.37. P = ( 1 1 ),    Q = (  2 -1 )
          ( 1 2 )         ( -1  1 )

7.41. (i) ( 2 -4  9 )    (ii) ( 3  4 )    (iii) (2, 3, -7, -1)    (iv) ( 3 )
          ( 5  3 -2 )         ( 5 -2 )                                 ( 5 )
                              ( 1  7 )
                              ( 4  0 )

7.42. (i) (  3  11   5 )
          ( -1  -8  -3 )
chapter 8
Determinants
INTRODUCTION
To every square matrix A over a field K there is assigned a specific scalar called the
determinant of A; it is usually denoted by
    det(A)    or    |A|
This determinant function was first discovered in the investigation of systems of linear
equations. We shall see in the succeeding chapters that the determinant is an indispensable
tool in investigating and obtaining properties of a linear operator.
We comment that the definition of the determinant and most of its properties also apply
in the case where the entries of a matrix come from a ring (see Appendix B).
We shall begin the chapter with a discussion of permutations, which is necessary for
the definition of the determinant.
PERMUTATIONS
A one-to-one mapping σ of the set {1, 2, ..., n} onto itself is called a permutation. We
denote the permutation σ by

    σ = ( 1   2  ...  n  )        or        σ = j1 j2 ... jn,   where ji = σ(i)
        ( j1  j2 ...  jn )

Observe that since σ is one-to-one and onto, the sequence j1 j2 ... jn is simply a rearrange-
ment of the numbers 1, 2, ..., n. We remark that the number of such permutations is n!,
and that the set of them is usually denoted by Sn. We also remark that if σ ∈ Sn, then the
inverse mapping σ^{-1} ∈ Sn; and if σ, τ ∈ Sn, then the composition mapping σ∘τ ∈ Sn. In
particular, the identity mapping ε = σ∘σ^{-1} belongs to Sn. (In fact, ε = 12...n.)
Example 8.1:  There are 2! = 2·1 = 2 permutations in S2: 12 and 21.
Example 8.2:  There are 3! = 3·2·1 = 6 permutations in S3: 123, 132, 213, 231, 312, 321.
Consider an arbitrary permutation σ in Sn; say, σ = j1 j2 ... jn. We say σ is even or odd
according as there is an even or odd number of pairs (i, k) for which

    i > k   but   i precedes k in σ        (*)

We then define the sign or parity of σ, written sgn σ, by

    sgn σ =   1   if σ is even
             -1   if σ is odd
172
DETERMINANTS
[CHAP. 8
Example 8.3:  Consider the permutation σ = 35142 in S5.
    3 and 5 precede and are greater than 1; hence (3, 1) and (5, 1) satisfy (*).
    3, 5 and 4 precede and are greater than 2; hence (3, 2), (5, 2) and (4, 2) satisfy (*).
    5 precedes and is greater than 4; hence (5, 4) satisfies (*).
    Since exactly six pairs satisfy (*), σ is even and sgn σ = 1.

Example 8.4:  The identity permutation ε = 12...n is even since no pair can satisfy (*).

Example 8.5:  In S2, 12 is even, and 21 is odd.
    In S3, 123, 231 and 312 are even, and 132, 213 and 321 are odd.
Example 8.6:  Let τ be the permutation which interchanges two numbers i and j and leaves the
other numbers fixed:

    τ(i) = j,    τ(j) = i,    τ(k) = k,  k ≠ i, j

We call τ a transposition. If i < j, then

    τ = 12 ... (i-1) j (i+1) ... (j-1) i (j+1) ... n

There are 2(j - i - 1) + 1 pairs satisfying (*):

    (j, i),   (j, x),   (x, i),    where x = i+1, ..., j-1

Thus the transposition τ is odd.
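The parity count in (*) can be transcribed directly into a short sketch (the function name is ours): count the pairs where a larger number precedes a smaller one.

```python
# sgn of a permutation, written as a sequence j1 j2 ... jn:
# count the pairs (i, k) with i > k where i precedes k.
def sgn(perm):
    inversions = sum(1 for a in range(len(perm))
                       for b in range(a + 1, len(perm))
                       if perm[a] > perm[b])
    return 1 if inversions % 2 == 0 else -1

print(sgn((3, 5, 1, 4, 2)))   # Example 8.3: six such pairs, so 1
print(sgn((2, 1)))            # a transposition is odd: -1
```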
DETERMINANT
Let A — (Oij) be an «square matrix over a field K:
jdll (tl2 ... Oln^
A _ I 021 0.22 ■ . ■ 0'2n
\a„i
CLn2
Consider a product of n elements of A such that one and only one element comes from each
row and one and only one element comes from each column. Such a product can be written
in the form
ftlil 0.212 ■ ■ ■ ^^'n
that is, where the factors come from successive rows and so the first subscripts are in the
natural order 1,2, . . .,n. Now since the factors come from different columns, the sequence
of second subscripts form a permutation <t = ji 32 . . . jn in Sn. Conversely, each permuta
tion in Sn determines a product of the above form. Thus the matrix A contains n\ such
products.
Definition: The determinant of the wsquare matrix A = (Odi), denoted by det(A) or A,
is the following sum which is summed over all permutations o = ji jz . . . h
in Sn.
\A\ = 2/ (sgn o)a'UiO,2j^ . . . a„j„
That is,
2 (sgn C7)aicr(l) tt2(T(2) • • • Onain)
The determinant of the wsquare matrix A is said to be of order n and is frequently
denoted by
ail ai2
021 0.22
Onl 0,n2
ttln
0.2n
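The definition can be transcribed literally into a sketch (our own helper names): one signed product per permutation. This is for illustration only; with n! terms it is hopeless as an algorithm.

```python
from itertools import permutations
from math import prod

def sgn(perm):
    inv = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
              if perm[a] > perm[b])
    return -1 if inv % 2 else 1

# |A| = sum over sigma in S_n of (sgn sigma) * a_{1 sigma(1)} ... a_{n sigma(n)}
def det(A):
    n = len(A)
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

print(det([[4, 5], [1, -2]]))                    # -13, as in Example 8.8
print(det([[2, 3, 4], [5, 6, 7], [8, 9, 1]]))    # 27, as in Example 8.10(i)
```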
We emphasize that a square array of scalars enclosed by straight lines is not a matrix but
rather the scalar that the determinant assigns to the matrix formed by the array of scalars.
Example 8.7:  The determinant of a 1 x 1 matrix A = (a11) is the scalar a11 itself: |A| = a11.
    (We note that the one permutation in S1 is even.)

Example 8.8:  In S2, the permutation 12 is even and the permutation 21 is odd. Hence

    | a11  a12 |  =  a11 a22 - a12 a21
    | a21  a22 |

    Thus   | 4   5 |  =  4(-2) - (5)(1)  =  -13     and     | a  b |  =  ad - bc.
           | 1  -2 |                                        | c  d |

Example 8.9:  In S3, the permutations 123, 231 and 312 are even, and the permutations 321, 213 and
132 are odd. Hence

    | a11  a12  a13 |
    | a21  a22  a23 |  =  a11 a22 a33 + a12 a23 a31 + a13 a21 a32
    | a31  a32  a33 |         - a13 a22 a31 - a12 a21 a33 - a11 a23 a32

This may be written as:

    a11(a22 a33 - a23 a32) - a12(a21 a33 - a23 a31) + a13(a21 a32 - a22 a31)

       =  a11 | a22  a23 |  -  a12 | a21  a23 |  +  a13 | a21  a22 |
              | a32  a33 |         | a31  a33 |         | a31  a32 |

which is a linear combination of three determinants of order two whose coefficients
(with alternating signs) form the first row of the given matrix. Note that each
2 x 2 matrix can be obtained by deleting, in the original matrix, the row and column
containing its coefficient.
Example 8.10:  (i)   | 2  3  4 |
                     | 5  6  7 |  =  2 | 6  7 |  -  3 | 5  7 |  +  4 | 5  6 |
                     | 8  9  1 |       | 9  1 |       | 8  1 |       | 8  9 |

                  =  2(6 - 63) - 3(5 - 56) + 4(45 - 48)  =  27

               (ii)  | 2  3 -4 |
                     | 0 -4  2 |  =  2 | -4  2 |  -  3 | 0  2 |  +  (-4) | 0 -4 |
                     | 1 -1  5 |       | -1  5 |       | 1  5 |          | 1 -1 |

                  =  2(-20 + 2) - 3(0 - 2) - 4(0 + 4)  =  -46

As n increases, the number of terms in the determinant becomes astronomical. Accord-
ingly, we use indirect methods to evaluate determinants rather than computing them directly
from the definition. In fact we prove a number of properties about determinants which will
permit us to shorten the computation considerably. In particular, we show that a determinant
of order n is equal to a linear combination of determinants of order n - 1, as in the case
n = 3 above.
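The reduction just described generalizes to any order: expanding along the first row evaluates a determinant of order n in terms of n determinants of order n - 1. A recursive sketch (still O(n!) work, so for illustration only; the names are ours):

```python
# Determinant by repeated expansion along the first row.
def det(A):
    if len(A) == 1:
        return A[0][0]
    total = 0
    for j, coeff in enumerate(A[0]):
        minor = [row[:j] + row[j+1:] for row in A[1:]]   # delete row 1, column j+1
        total += (-1) ** j * coeff * det(minor)          # alternating signs
    return total

print(det([[2, 3, 4], [5, 6, 7], [8, 9, 1]]))   # 27, as in Example 8.10(i)
```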
PROPERTIES OF DETERMINANTS
We now list basic properties of the determinant.
Theorem 8.1: The determinant of a matrix A and its transpose A^t are equal: |A| = |A^t|.
By this theorem, any theorem about the determinant of a matrix A which concerns the
rows of A will have an analogous theorem concerning the columns of A.
The next theorem gives certain cases for which the determinant can be obtained
immediately.
Theorem 8.2: Let A be a square matrix.
(i)   If A has a row (column) of zeros, then |A| = 0.
(ii)  If A has two identical rows (columns), then |A| = 0.
(iii) If A is triangular, i.e. A has zeros above or below the diagonal, then
      |A| = product of diagonal elements. Thus in particular, |I| = 1 where
      I is the identity matrix.

The next theorem shows how the determinant of a matrix is affected by the "elementary"
operations.

Theorem 8.3: Let B be the matrix obtained from a matrix A by
(i)   multiplying a row (column) of A by a scalar k; then |B| = k|A|.
(ii)  interchanging two rows (columns) of A; then |B| = -|A|.
(iii) adding a multiple of a row (column) of A to another; then |B| = |A|.
We now state two of the most important and useful theorems on determinants.
Theorem 8.4: Let A be any n-square matrix. Then the following are equivalent:
(i)   A is invertible, i.e. A has an inverse A^{-1}.
(ii)  A is nonsingular, i.e. AX = 0 has only the zero solution, or rank A = n,
      or the rows (columns) of A are linearly independent.
(iii) The determinant of A is not zero: |A| ≠ 0.

Theorem 8.5: The determinant is a multiplicative function. That is, the determinant of
a product of two matrices A and B is equal to the product of their deter-
minants: |AB| = |A| |B|.

We shall prove the above two theorems using the theory of elementary matrices (see
page 56) and the following lemma.

Lemma 8.6: Let E be an elementary matrix. Then, for any matrix A, |EA| = |E| |A|.
We comment that one can also prove the preceding two theorems directly without
resorting to the theory of elementary matrices.
MINORS AND COFACTORS
Consider an wsquare matrix A = (ay). Let M« denote the (w l)square submatrix of
A obtained by deleting its ith row and .7th column. The determinant \Mij\ is called the mmor
of the element ay of A, and we define the cof actor of Oni, denoted by A«, to be the "signed"
^^""^= A« = iiy^^m
Note that the "signs" (1)*+^' accompanying the minors form a chessboard pattern with
+'s on the main diagonal:
■ ; : ; ::\
+  + ■■
We emphasize that My denotes a matrix whereas Ay denotes a scalar.
Example 8.11:  Let A = ( 2  3  4 ).  Then  M23 = ( 2  3 )
                       ( 5  6  7 )               ( 8  9 )
                       ( 8  9  1 )
and
    A23 = (-1)^{2+3} | 2  3 |  =  -(18 - 24)  =  6
                     | 8  9 |
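The definitions of M_ij and A_ij translate directly into code. A sketch with names of our own choosing, using 1-based indices to match the text; a determinant routine is passed in so any evaluator can be used:

```python
# Minor M_ij: delete row i and column j (1-based, as in the text).
def minor(A, i, j):
    return [row[:j-1] + row[j:] for k, row in enumerate(A, start=1) if k != i]

# Cofactor A_ij = (-1)^(i+j) |M_ij|; `det` evaluates the minor.
def cofactor(A, i, j, det):
    return (-1) ** (i + j) * det(minor(A, i, j))

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[2, 3, 4], [5, 6, 7], [8, 9, 1]]
print(minor(A, 2, 3))              # [[2, 3], [8, 9]], the M_23 of Example 8.11
print(cofactor(A, 2, 3, det2))     # -(18 - 24) = 6
```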
The following theorem applies.

Theorem 8.7: The determinant of the matrix A = (aij) is equal to the sum of the products
obtained by multiplying the elements of any row (column) by their re-
spective cofactors:

    |A| = a_{i1} A_{i1} + a_{i2} A_{i2} + ... + a_{in} A_{in}  =  Σ_{j=1}^{n} a_{ij} A_{ij}
and
    |A| = a_{1j} A_{1j} + a_{2j} A_{2j} + ... + a_{nj} A_{nj}  =  Σ_{i=1}^{n} a_{ij} A_{ij}

The above formulas, called the Laplace expansions of the determinant of A by the ith
row and the jth column respectively, offer a method of simplifying the computation of |A|.
That is, by adding a multiple of a row (column) to another row (column) we can reduce A
to a matrix containing a row or column with one entry 1 and the others 0. Expanding by
this row or column reduces the computation of |A| to the computation of a determinant of
order one less than that of |A|.
Example 8.12:  Compute the determinant of  A = (  5   4   2   1 )
                                               (  2   3   1  -2 )
                                               ( -5  -7  -3   9 )
                                               (  1  -2  -1   4 )

Note that a 1 appears in the second row, third column. Perform the following
operations on A, where Ri denotes the ith row:
    (i) add -2R2 to R1,   (ii) add 3R2 to R3,   (iii) add R2 to R4.
By Theorem 8.3(iii), the value of the determinant does not change by these opera-
tions; that is,

    |A| = |  5   4   2   1 |  =  | 1  -2   0   5 |
          |  2   3   1  -2 |     | 2   3   1  -2 |
          | -5  -7  -3   9 |     | 1   2   0   3 |
          |  1  -2  -1   4 |     | 3   1   0   2 |

Now if we expand by the third column, we may neglect all terms which contain 0.
Thus

    |A| = (-1)^{2+3} | 1  -2   5 |  =  -{ 1 | 2  3 |  -  (-2) | 1  3 |  +  5 | 1  2 | }  =  38
                     | 1   2   3 |        | 1  2 |            | 3  2 |       | 3  1 |
                     | 3   1   2 |
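The strategy of Example 8.12 — use Theorem 8.3 to create zeros, then expand — is essentially Gaussian elimination. A sketch of an elimination-based evaluator (our own helper, using exact fractions so no rounding intrudes):

```python
from fractions import Fraction

# |A| by row reduction: adding a multiple of one row to another leaves the
# determinant unchanged (Theorem 8.3(iii)), a row swap flips the sign
# (8.3(ii)), so |A| is (+/-) the product of the pivots.
def det(A):
    M = [[Fraction(x) for x in row] for row in A]
    n, sign = len(M), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    p = Fraction(sign)
    for i in range(n):
        p *= M[i][i]
    return p

A = [[5, 4, 2, 1], [2, 3, 1, -2], [-5, -7, -3, 9], [1, -2, -1, 4]]
print(det(A))   # 38, as in Example 8.12
```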
CLASSICAL ADJOINT
Consider an n-square matrix A = (aij) over a field K:

    A = ( a11  a12  ...  a1n )
        ( a21  a22  ...  a2n )
        ( .................. )
        ( an1  an2  ...  ann )

The transpose of the matrix of cofactors of the elements aij of A, denoted by adj A, is called
the classical adjoint of A:

    adj A = ( A11  A21  ...  An1 )
            ( A12  A22  ...  An2 )
            ( .................. )
            ( A1n  A2n  ...  Ann )
We say "classical adjoint" instead of simply "adjoint" because the term adjoint will be used
in Chapter 13 for an entirely different concept.
Example 8.13:  Let A = ( 2  3 -4 ).  The cofactors of the nine elements of A are
                       ( 0 -4  2 )
                       ( 1 -1  5 )

    A11 = + | -4  2 | = -18,    A12 = - | 0  2 | =  2,    A13 = + | 0 -4 | =  4
            | -1  5 |                   | 1  5 |                  | 1 -1 |

    A21 = - |  3 -4 | = -11,    A22 = + | 2 -4 | = 14,    A23 = - | 2  3 | =  5
            | -1  5 |                   | 1  5 |                  | 1 -1 |

    A31 = + |  3 -4 | = -10,    A32 = - | 2 -4 | = -4,    A33 = + | 2  3 | = -8
            | -4  2 |                   | 0  2 |                  | 0 -4 |

We form the transpose of the above matrix of cofactors to obtain the classical adjoint
of A:

    adj A = ( -18  -11  -10 )
            (   2   14   -4 )
            (   4    5   -8 )

Theorem 8.8: For any square matrix A,

    A (adj A) = (adj A) A = |A| I

where I is the identity matrix. Thus, if |A| ≠ 0,

    A^{-1} = (1/|A|) (adj A)
Observe that the above theorem gives us an important method of obtaining the inverse
of a given matrix.
Example 8.14:  Consider the matrix A of the preceding example, for which |A| = -46. We have

    A (adj A) = ( 2  3 -4 ) ( -18  -11  -10 )     ( -46    0    0 )
                ( 0 -4  2 ) (   2   14   -4 )  =  (   0  -46    0 )  =  -46 I  =  |A| I
                ( 1 -1  5 ) (   4    5   -8 )     (   0    0  -46 )

We also have, by Theorem 8.8,

    A^{-1} = (1/|A|)(adj A) = ( -18/-46  -11/-46  -10/-46 )     (  9/23   11/46   5/23 )
                              (   2/-46   14/-46   -4/-46 )  =  ( -1/23   -7/23   2/23 )
                              (   4/-46    5/-46   -8/-46 )     ( -2/23   -5/46   4/23 )
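Theorem 8.8's recipe can be sketched for the 3 x 3 case (the helper names are ours; `det2`/`det3` expand along the first row):

```python
# Classical adjoint of a 3x3 matrix: transpose of the cofactor matrix.
def det2(M):
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

def det3(M):
    return sum((-1)**j * M[0][j] *
               det2([r[:j] + r[j+1:] for r in M[1:]]) for j in range(3))

def adj3(M):
    C = [[(-1)**(i+j) * det2([r[:j] + r[j+1:]
                              for k, r in enumerate(M) if k != i])
          for j in range(3)] for i in range(3)]          # cofactor matrix
    return [[C[j][i] for j in range(3)] for i in range(3)]   # its transpose

A = [[2, 3, -4], [0, -4, 2], [1, -1, 5]]
print(adj3(A))   # [[-18, -11, -10], [2, 14, -4], [4, 5, -8]], as in Example 8.13
print(det3(A))   # -46
```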
APPLICATIONS TO LINEAR EQUATIONS

Consider a system of n linear equations in n unknowns:

    a11 x1 + a12 x2 + ... + a1n xn = b1
    a21 x1 + a22 x2 + ... + a2n xn = b2
    ...............................
    an1 x1 + an2 x2 + ... + ann xn = bn
Let Δ denote the determinant of the matrix A = (aij) of coefficients: Δ = |A|. Also, let Δi
denote the determinant of the matrix obtained by replacing the ith column of A by the
column of constant terms. The fundamental relationship between determinants and the
solution of the above system follows.

Theorem 8.9: The above system has a unique solution if and only if Δ ≠ 0. In this case
the unique solution is given by

    x1 = Δ1/Δ,   x2 = Δ2/Δ,   ...,   xn = Δn/Δ

The above theorem is known as "Cramer's rule" for solving systems of linear equations.
We emphasize that the theorem only refers to a system with the same number of equations
as unknowns, and that it only gives the solution when Δ ≠ 0. In fact, if Δ = 0 the theorem
does not tell whether or not the system has a solution. However, in the case of a homo-
geneous system we have the following useful result.

Theorem 8.10: The homogeneous system Ax = 0 has a nonzero solution if and only if
Δ = |A| = 0.
Example 8.15:  Solve, using determinants:    2x - 3y = 7
                                             3x + 5y = 1

First compute the determinant Δ of the matrix of coefficients:

    Δ = | 2 -3 |  =  10 + 9  =  19
        | 3  5 |

Since Δ ≠ 0, the system has a unique solution. We also have

    Δx = | 7 -3 |  =  38,        Δy = | 2  7 |  =  -19
         | 1  5 |                     | 3  1 |

Accordingly, the unique solution of the system is

    x = Δx/Δ = 38/19 = 2,        y = Δy/Δ = -19/19 = -1
We remark that the preceding theorem is of interest more for theoretical and historical
reasons than for practical reasons. The previous method of solving systems of linear equa
tions, i.e. by reducing a system to echelon form, is usually much more efficient than by using
determinants.
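Cramer's rule for a 2 x 2 system can be sketched as follows (a toy helper of our own, using exact fractions; as remarked above, echelon reduction is the practical method):

```python
from fractions import Fraction

def det2(M):
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

# Cramer's rule (Theorem 8.9): x_i = (determinant with the ith column
# replaced by the constants) / (determinant of the coefficients).
def cramer2(A, b):
    d = det2(A)
    if d == 0:
        raise ValueError("Cramer's rule needs a nonzero determinant")
    Ax = [[b[0], A[0][1]], [b[1], A[1][1]]]   # replace 1st column by b
    Ay = [[A[0][0], b[0]], [A[1][0], b[1]]]   # replace 2nd column by b
    return Fraction(det2(Ax), d), Fraction(det2(Ay), d)

x, y = cramer2([[2, -3], [3, 5]], [7, 1])
print(x, y)   # x = 2, y = -1, as in Example 8.15
```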
DETERMINANT OF A LINEAR OPERATOR

Using the multiplicative property of the determinant (Theorem 8.5), we obtain

Theorem 8.11: Suppose A and B are similar matrices. Then |A| = |B|.

Now suppose T is an arbitrary linear operator on a vector space V. We define the
determinant of T, written det(T), by

    det(T) = |[T]_e|

where [T]_e is the matrix of T in a basis {e_i}. By the above theorem this definition is in-
dependent of the particular basis that is chosen.

The next theorem follows from the analogous theorems on matrices.

Theorem 8.12: Let T and S be linear operators on a vector space V. Then
(i)  det(S ∘ T) = det(S) · det(T),
(ii) T is invertible if and only if det(T) ≠ 0.
We also remark that det(1_V) = 1 where 1_V is the identity mapping, and that det(T^{-1}) =
det(T)^{-1} if T is invertible.
Example 8.16:  Let T be the linear operator on R^3 defined by

    T(x, y, z) = (2x - 4y + z, x - 2y + 3z, 5x + y - z)

The matrix of T in the usual basis of R^3 is  [T] = ( 2 -4  1 ).  Then
                                                    ( 1 -2  3 )
                                                    ( 5  1 -1 )

    det(T) = | 2 -4  1 |  =  2(2 - 3) + 4(-1 - 15) + 1(1 + 10)  =  -55
             | 1 -2  3 |
             | 5  1 -1 |
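Theorem 8.11, and hence the basis-independence of det(T), can be spot-checked numerically; P below is an arbitrary invertible matrix of our own choosing, playing the role of a change of basis.

```python
import numpy as np

# Similar matrices B = P^{-1} A P have the same determinant (Theorem 8.11),
# so det(T) does not depend on the basis used to represent T.
A = np.array([[2, -4, 1], [1, -2, 3], [5, 1, -1]], dtype=float)  # Example 8.16
P = np.array([[1, 2, 0], [0, 1, 1], [1, 0, 1]], dtype=float)     # invertible

B = np.linalg.inv(P) @ A @ P
print(round(np.linalg.det(A)))   # -55
print(round(np.linalg.det(B)))   # -55
```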
MULTILINEARITY AND DETERMINANTS

Let 𝒜 denote the set of all n-square matrices A over a field K. We may view A as an
n-tuple consisting of its row vectors A1, A2, ..., An:

    A = (A1, A2, ..., An)

Hence 𝒜 may be viewed as the set of n-tuples of n-tuples in K:

    𝒜 = K^n x K^n x ... x K^n        (n factors)

The following definitions apply.

Definition: A function D : 𝒜 -> K is said to be multilinear if it is linear in each of the
components; that is:
(i)  if row Ai = B + C, then
     D(A) = D(..., B + C, ...) = D(..., B, ...) + D(..., C, ...);
(ii) if row Ai = kB where k ∈ K, then
     D(A) = D(..., kB, ...) = k D(..., B, ...).
We also say n-linear for multilinear if there are n components.

Definition: A function D : 𝒜 -> K is said to be alternating if D(A) = 0 whenever A has
two identical rows:

    D(A1, A2, ..., An) = 0    whenever Ai = Aj, i ≠ j

We have the following basic result; here I denotes the identity matrix.

Theorem 8.13: There exists a unique function D : 𝒜 -> K such that:
(i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1.
This function D is none other than the determinant function; that is, for
any matrix A ∈ 𝒜, D(A) = |A|.
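The two axioms of Theorem 8.13 can be spot-checked numerically for the 3 x 3 determinant (an illustration with rows of our own choosing, not a proof):

```python
# Spot-check: |A| is linear in each row and vanishes on identical rows.
def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

R, S = [1, 2, 3], [4, 0, -2]           # two candidate second rows
top, bottom = [2, -1, 5], [0, 3, 1]

lhs = det3([top, [r + s for r, s in zip(R, S)], bottom])
rhs = det3([top, R, bottom]) + det3([top, S, bottom])
print(lhs == rhs)                      # True: linear in the second row
print(det3([top, top, bottom]))        # 0: alternating
```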
Solved Problems
COMPUTATION OF DETERMINANTS
8.1. Evaluate the determinant of each matrix:  (i) ( 3 -2 ),  (ii) ( a-b    a  )
                                                   ( 4  5 )        (  a    a+b )

(i)  | 3 -2 |  =  3(5) - (-2)(4)  =  15 + 8  =  23
     | 4  5 |

(ii) | a-b    a  |  =  (a-b)(a+b) - a·a  =  -b^2
     |  a    a+b |

8.2. Determine those values of k for which  | k   k  |  =  0.
                                            | 4  2k  |

    | k   k  |  =  2k^2 - 4k  =  0,   or   2k(k - 2) = 0.   Hence k = 0 and k = 2.
    | 4  2k  |

That is, if k = 0 or k = 2, the determinant is zero.
8.3. Compute the determinant of each matrix:

         ( 1  2  3 )          ( 2  0  1 )
    (i)  ( 4 -2  3 )    (ii)  ( 3  2 -3 )    (iii) ...    (iv) ...
         ( 2  5 -1 )          (-1 -3  5 )

(i)  | 1  2  3 |
     | 4 -2  3 |  =  1 | -2  3 |  -  2 | 4  3 |  +  3 | 4 -2 |
     | 2  5 -1 |       |  5 -1 |       | 2 -1 |       | 2  5 |

        =  1(2 - 15) - 2(-4 - 6) + 3(20 + 4)  =  79

(ii) | 2  0  1 |
     | 3  2 -3 |  =  2 | 2 -3 |  +  1 | 3  2 |  =  2(10 - 9) + 1(-9 + 2)  =  -5
     |-1 -3  5 |       |-3  5 |       |-1 -3 |

(iii) Expanding similarly,  = 1(6 + 4) = 10.

(iv)  = 24.
8.4. Consider the 3-square matrix  A = ( a1  b1  c1 ).  Show that the diagrams below can
                                       ( a2  b2  c2 )
be used to obtain the determinant of A:  ( a3  b3  c3 )

    [Two arrow diagrams: the left one traces the three "plus" diagonals of A,
     the right one the three "minus" diagonals.]

Form the product of each of the three numbers joined by an arrow in the diagram on the left,
and precede each product by a plus sign as follows:

    + a1 b2 c3  +  b1 c2 a3  +  c1 a2 b3

Now form the product of each of the three numbers joined by an arrow in the diagram on the
right, and precede each product by a minus sign as follows:

    - a3 b2 c1  -  b3 c2 a1  -  c3 a2 b1

Then the determinant of A is precisely the sum of the above two expressions:

    | a1  b1  c1 |
    | a2  b2  c2 |  =  a1 b2 c3 + b1 c2 a3 + c1 a2 b3 - a3 b2 c1 - b3 c2 a1 - c3 a2 b1
    | a3  b3  c3 |

The above method of computing |A| does not hold for determinants of order greater than 3.
8.5. Evaluate the determinant of each matrix:

         ( 2  0  1 )         ( a  b  c )          (  3 -2 -4 )
    (i)  (-3  0  2 )   (ii)  ( c  a  b )   (iii)  (  1  0 -2 )
         ( 4  3  7 )         ( b  c  a )          ( -2  3  3 )

(i) Expand the determinant by the second column, neglecting terms containing a 0:

    | 2  0  1 |
    |-3  0  2 |  =  -3 | 2  1 |  =  -3(4 + 3)  =  -21
    | 4  3  7 |        |-3  2 |

(ii) Use the method of the preceding problem:

    | a  b  c |
    | c  a  b |  =  a^3 + b^3 + c^3 - abc - abc - abc  =  a^3 + b^3 + c^3 - 3abc
    | b  c  a |

(iii) Add twice the first column to the third column, and then expand by the second row:

    |  3 -2 -4 |     |  3 -2  -4+2(3)  |     |  3 -2  2 |
    |  1  0 -2 |  =  |  1  0  -2+2(1)  |  =  |  1  0  0 |  =  -1 | -2  2 |  =  -(2 - 6)  =  4
    | -2  3  3 |     | -2  3   3+2(-2) |     | -2  3 -1 |        |  3 -1 |
8.6. Evaluate the determinant of  A = ( 1/2   -1   1/3 )
                                      ( 3/4   1/2   1  )
                                      (  1     4    1  )

First multiply the first row by 6 and the second row by 4. Then

    6·4·|A| = 24|A| = | 3 -6  2 |
                      | 3  2  4 |
                      | 1  4  1 |

Now add 4 times the first column to the second column, subtract the first column from the third
column, and expand by the third column:

    24|A| = | 3   6 -1 |  =  - | 3 14 |  -  | 3  6 |  =  -(24 - 14) - (24 - 6)  =  -28
            | 3  14  1 |       | 1  8 |     | 1  8 |
            | 1   8  0 |

and |A| = -28/24 = -7/6.
8.7. Evaluate the determinant of  A = (  2   5  -3  -2 )
                                      ( -2  -3   2  -5 )
                                      (  1   3  -2   2 )
                                      ( -1  -6   4   3 )

Note that a 1 appears in the third row, first column. Apply the following operations on A
(where Ri denotes the ith row):  (i) add -2R3 to R1,  (ii) add 2R3 to R2,  (iii) add R3 to R4.  Thus

    |A| = |  2   5  -3  -2 |     |  0  -1   1  -6 |
          | -2  -3   2  -5 |  =  |  0   3  -2  -1 |  =  (-1)^{3+1} | -1   1  -6 |
          |  1   3  -2   2 |     |  1   3  -2   2 |                |  3  -2  -1 |
          | -1  -6   4   3 |     |  0  -3   2   5 |                | -3   2   5 |

Now add the second column to the first column, and 6 times the second column to the third column:

    |A| = |  0   1    0 |  =  - | 1  -13 |  =  -(17 - 13)  =  -4
          |  1  -2  -13 |       |-1   17 |
          | -1   2   17 |
8.8. Evaluate the determinant of  A = (  3  -2  -5   4 )
                                      ( -5   2   8  -5 )
                                      ( -2   4   7  -3 )
                                      (  2  -3  -5   8 )

First reduce A to a matrix which has 1 as an entry, such as by adding twice the first row to the
second row, and then proceed as in the preceding problem:

    |A| = |  3  -2  -5   4 |     |  3  -2  -5   4 |
          | -5   2   8  -5 |  =  |  1  -2  -2   3 |
          | -2   4   7  -3 |     | -2   4   7  -3 |
          |  2  -3  -5   8 |     |  2  -3  -5   8 |

Now add 2 times the first column to the second column and to the third column, and -3 times the
first column to the fourth column:

    |A| = |  3  -2+2(3)  -5+2(3)   4-3(3) |     |  3   4   1  -5 |
          |  1  -2+2(1)  -2+2(1)   3-3(1) |  =  |  1   0   0   0 |
          | -2   4+2(-2)  7+2(-2) -3-3(-2)|     | -2   0   3   3 |
          |  2  -3+2(2)  -5+2(2)   8-3(2) |     |  2   1  -1   2 |

Expanding by the second row,

    |A| = - |  4   1  -5 |  =  -[ 4(6 + 3) - 1(0 - 3) - 5(0 - 3) ]  =  -(36 + 3 + 15)  =  -54
            |  0   3   3 |
            |  1  -1   2 |
8.9. Evaluate the determinant of  A = ( t+3   -1    1  )
                                      (  5   t-3    1  )
                                      (  6    -6   t+4 )

Add the second column to the first column, and then add the third column to the second column,
to obtain

    |A| = | t+2    0    1  |
          | t+2   t-2   1  |
          |  0    t-2  t+4 |

Now factor t+2 from the first column and t-2 from the second column to get

    |A| = (t+2)(t-2) | 1  0   1  |
                     | 1  1   1  |
                     | 0  1  t+4 |

Finally subtract the first column from the third column to obtain

    |A| = (t+2)(t-2) | 1  0   0  |  =  (t+2)(t-2)(t+4)
                     | 1  1   0  |
                     | 0  1  t+4 |
182
DETERMINANTS
[CHAP. 8
COFACTORS
8.10. Find the cofactor of the 7 in the matrix  A = ( 2   1  -3   4 )
                                                    ( 5  -4   7  -2 )
                                                    ( 4   0   6  -3 )
                                                    ( 3  -2   5   2 )

    A23 = (-1)^{2+3} | 2   1   4 |  =  - { 2 | 0 -3 |  -  1 | 4 -3 |  +  4 | 4  0 | }
                     | 4   0  -3 |           |-2  2 |       | 3  2 |       | 3 -2 |
                     | 3  -2   2 |

        =  - { 2(0 - 6) - 1(8 + 9) + 4(-8 - 0) }  =  - (-12 - 17 - 32)  =  61

The exponent 2+3 comes from the fact that 7 appears in the second row, third column.
8.11. Consider the matrix  A = ( 1  2  3 ).  (i) Compute |A|.  (ii) Find adj A.  (iii) Verify
                               ( 2  3  4 )
A·(adj A) = |A| I.  (iv) Find A^{-1}.        ( 1  5  7 )

(i)  |A| = 1 | 3  4 |  -  2 | 2  4 |  +  3 | 2  3 |  =  1 - 20 + 21  =  2
             | 5  7 |       | 1  7 |       | 1  5 |

(ii) The cofactors are

     A11 =  1,   A12 = -10,   A13 =  7
     A21 =  1,   A22 =   4,   A23 = -3
     A31 = -1,   A32 =   2,   A33 = -1

     and adj A is the transpose of the matrix of cofactors:

     adj A = (   1   1  -1 )
             ( -10   4   2 )
             (   7  -3  -1 )

     Observe that the "signs" in the matrix of cofactors form the chessboard pattern

     ( + - + )
     ( - + - )
     ( + - + )

(iii) A·(adj A) = ( 1  2  3 ) (   1   1  -1 )     ( 2  0  0 )
                  ( 2  3  4 ) ( -10   4   2 )  =  ( 0  2  0 )  =  2I  =  |A| I
                  ( 1  5  7 ) (   7  -3  -1 )     ( 0  0  2 )

(iv) A^{-1} = (1/|A|)(adj A) = ( 1/2   1/2  -1/2 )
                               ( -5     2     1  )
                               ( 7/2  -3/2  -1/2 )
8.12. Consider an arbitrary 2 by 2 matrix  A = ( a  b ).
                                               ( c  d )
(i) Find adj A.  (ii) Show that adj (adj A) = A.

(i)  The matrix of cofactors is ( +d  -c ), so its transpose gives
                                ( -b  +a )

     adj A = (  d  -b )
             ( -c   a )

(ii) adj (adj A) = adj (  d  -b )  =  ( a  b )  =  A
                       ( -c   a )     ( c  d )
DETERMINANTS AND SYSTEMS OF LINEAR EQUATIONS

8.13. Solve for x and y, using determinants:

    (i)  2x + y  = 7        (ii)  ax - 2by = c       where ab ≠ 0
         3x - 5y = 4              3ax - 5by = 2c

(i)  Δ = | 2  1 |  =  -13,    Δx = | 7  1 |  =  -39,    Δy = | 2  7 |  =  -13.
         | 3 -5 |                  | 4 -5 |                  | 3  4 |

     Then x = Δx/Δ = 3,  y = Δy/Δ = 1.

(ii) Δ = |  a  -2b |  =  ab,    Δx = |  c  -2b |  =  -bc,    Δy = |  a   c  |  =  -ac.
         | 3a  -5b |                 | 2c  -5b |                  | 3a   2c |

     Then x = Δx/Δ = -c/a,  y = Δy/Δ = -c/b.
8.14. Solve using determinants:

    3y + 2x = z + 1
    3x + 2z = 8 - 5y
    3z - 1 = x - 2y

First arrange the system in standard form with the unknowns appearing in columns:

    2x + 3y -  z =  1
    3x + 5y + 2z =  8
     x - 2y - 3z = -1

Compute the determinant Δ of the matrix A of coefficients:

    Δ = | 2  3 -1 |  =  2(-15 + 4) - 3(-9 - 2) - 1(-6 - 5)  =  22
        | 3  5  2 |
        | 1 -2 -3 |

Since Δ ≠ 0, the system has a unique solution. To obtain Δx, Δy and Δz, replace the coefficients of
the unknown in the matrix A by the column of constants. Thus

    Δx = |  1  3 -1 |  =  66,    Δy = | 2  1 -1 |  =  -22,    Δz = | 2  3  1 |  =  44
         |  8  5  2 |                 | 3  8  2 |                  | 3  5  8 |
         | -1 -2 -3 |                 | 1 -1 -3 |                  | 1 -2 -1 |

and x = Δx/Δ = 3,  y = Δy/Δ = -1,  z = Δz/Δ = 2.
PROOF OF THEOREMS
8.15. Prove Theorem 8.1: |A^t| = |A|.

Suppose A = (aij). Then A^t = (bij) where bij = aji. Hence

    |A^t| = Σ_{σ∈Sn} (sgn σ) b_{1 σ(1)} b_{2 σ(2)} ... b_{n σ(n)}
          = Σ_{σ∈Sn} (sgn σ) a_{σ(1),1} a_{σ(2),2} ... a_{σ(n),n}

Let τ = σ^{-1}. By Problem 8.36, sgn τ = sgn σ, and

    a_{σ(1),1} a_{σ(2),2} ... a_{σ(n),n}  =  a_{1 τ(1)} a_{2 τ(2)} ... a_{n τ(n)}

Hence    |A^t| = Σ_{σ∈Sn} (sgn τ) a_{1 τ(1)} a_{2 τ(2)} ... a_{n τ(n)}

However, as σ runs through all the elements of Sn, τ = σ^{-1} also runs through all the elements of
Sn. Thus |A^t| = |A|.
8.16. Prove Theorem 8.3(ii): Let B be obtained from a square matrix A by interchanging
two rows (columns) of A. Then |B| = -|A|.

We prove the theorem for the case that two columns are interchanged. Let τ be the trans-
position which interchanges the two numbers corresponding to the two columns of A that are
interchanged. If A = (aij) and B = (bij), then bij = a_{i,τ(j)}. Hence, for any permutation σ,

    b_{1 σ(1)} b_{2 σ(2)} ... b_{n σ(n)}  =  a_{1,τσ(1)} a_{2,τσ(2)} ... a_{n,τσ(n)}
Thus    |B| = Σ_σ (sgn σ) b_{1 σ(1)} b_{2 σ(2)} ... b_{n σ(n)}
            = Σ_σ (sgn σ) a_{1,τσ(1)} a_{2,τσ(2)} ... a_{n,τσ(n)}

Since the transposition τ is an odd permutation, sgn τσ = sgn τ · sgn σ = -sgn σ. Thus sgn σ =
-sgn τσ, and so

    |B| = - Σ_σ (sgn τσ) a_{1,τσ(1)} a_{2,τσ(2)} ... a_{n,τσ(n)}

But as σ runs through all the elements of Sn, τσ also runs through all the elements of Sn; hence
|B| = -|A|.
8.17. Prove Theorem 8.2: (i) If A has a row (column) of zeros, then |A| = 0. (ii) If A has
two identical rows (columns), then |A| = 0. (iii) If A is triangular, then |A| = product
of diagonal elements. Thus in particular, |I| = 1 where I is the identity matrix.

(i) Each term in |A| contains a factor from every row and so from the row of zeros. Thus each
term of |A| is zero and so |A| = 0.

(ii) Suppose 1 + 1 ≠ 0 in K. If we interchange the two identical rows of A, we still obtain the
matrix A. Hence by the preceding problem, |A| = -|A| and so |A| = 0.

Now suppose 1 + 1 = 0 in K. Then sgn σ = 1 for every σ ∈ Sn. Since A has two iden-
tical rows, we can arrange the terms of |A| into pairs of equal terms. Since each pair is 0, the
determinant of A is zero.

(iii) Suppose A = (aij) is lower triangular, that is, the entries above the diagonal are all zero:
aij = 0 whenever i < j. Consider a term t of the determinant of A:

    t = (sgn σ) a_{1 i1} a_{2 i2} ... a_{n in},    where σ = i1 i2 ... in

Suppose i1 ≠ 1. Then 1 < i1 and so a_{1 i1} = 0; hence t = 0. That is, each term for which
i1 ≠ 1 is zero.

Now suppose i1 = 1 but i2 ≠ 2. Then 2 < i2 and so a_{2 i2} = 0; hence t = 0. Thus each
term for which i1 ≠ 1 or i2 ≠ 2 is zero.

Similarly we obtain that each term for which i1 ≠ 1 or i2 ≠ 2 or ... or in ≠ n is zero.
Accordingly, |A| = a11 a22 ... ann = product of diagonal elements.
8.18. Prove Theorem 8.3: Let B be obtained from A by
(i) multiplying a row (column) of A by a scalar k; then |B| = k|A|.
(ii) interchanging two rows (columns) of A; then |B| = −|A|.
(iii) adding a multiple of a row (column) of A to another; then |B| = |A|.
(i) If the jth row of A is multiplied by k, then every term in |A| is multiplied by k and so
|B| = k|A|. That is,
    |B| = Σ_σ (sgn σ) a_{1i_1} a_{2i_2} ··· (k a_{ji_j}) ··· a_{ni_n}
        = k Σ_σ (sgn σ) a_{1i_1} a_{2i_2} ··· a_{ni_n}  =  k|A|
(ii) Proved in Problem 8.16.
(iii) Suppose c times the kth row is added to the jth row of A. Using the symbol ˆ to denote the
jth position in a determinant term, we have
    |B| = Σ_σ (sgn σ) a_{1i_1} a_{2i_2} ··· (c a_{ki_j} + a_{ji_j})ˆ ··· a_{ni_n}
        = c Σ_σ (sgn σ) a_{1i_1} ··· a_{ki_j}ˆ ··· a_{ni_n}  +  Σ_σ (sgn σ) a_{1i_1} ··· a_{ji_j}ˆ ··· a_{ni_n}
The first sum is the determinant of a matrix whose kth and jth rows are identical;
hence by Theorem 8.2(ii) the sum is zero. The second sum is the determinant of A. Thus
|B| = c·0 + |A| = |A|.
CHAP. 8] DETERMINANTS 185
8.19. Prove Lemma 8.6: For any elementary matrix E, |EA| = |E||A|.
Consider the following elementary row operations: (i) multiply a row by a constant k ≠ 0;
(ii) interchange two rows; (iii) add a multiple of one row to another. Let E₁, E₂ and E₃ be the
corresponding elementary matrices. That is, E₁, E₂ and E₃ are obtained by applying the above
operations, respectively, to the identity matrix I. By the preceding problem,
    |E₁| = k|I| = k,   |E₂| = −|I| = −1,   |E₃| = |I| = 1
Recall (page 56) that E_iA is identical to the matrix obtained by applying the corresponding
operation to A. Thus by the preceding problem,
    |E₁A| = k|A| = |E₁||A|,   |E₂A| = −|A| = |E₂||A|,   |E₃A| = |A| = 1·|A| = |E₃||A|
and the lemma is proved.
8.20. Suppose B is row equivalent to A; say B = E_n E_{n−1} ··· E₂E₁A where the E_i are
elementary matrices. Show that:
(i) |B| = |E_n| |E_{n−1}| ··· |E₂| |E₁| |A|, (ii) |B| ≠ 0 if and only if |A| ≠ 0.
(i) By the preceding problem, |E₁A| = |E₁||A|. Hence by induction,
    |B| = |E_n| |E_{n−1} ··· E₂E₁A| = |E_n| |E_{n−1}| ··· |E₂| |E₁| |A|
(ii) By the preceding problem, |E_i| ≠ 0 for each i. Hence |B| ≠ 0 if and only if |A| ≠ 0.
8.21. Prove Theorem 8.4: Let A be an n-square matrix. Then the following are equivalent:
(i) A is invertible, (ii) A is nonsingular, (iii) |A| ≠ 0.
By Problem 6.44, (i) and (ii) are equivalent. Hence it suffices to show that (i) and (iii) are
equivalent.
Suppose A is invertible. Then A is row equivalent to the identity matrix I. But |I| ≠ 0; hence
by the preceding problem, |A| ≠ 0. On the other hand, suppose A is not invertible. Then A is row
equivalent to a matrix B which has a zero row. By Theorem 8.2(i), |B| = 0; then by the preceding
problem, |A| = 0. Thus (i) and (iii) are equivalent.
8.22. Prove Theorem 8.5: |AB| = |A||B|.
If A is singular, then AB is also singular and so |AB| = 0 = |A||B|. On the other hand, if A
is nonsingular, then A = E_n ··· E₂E₁, a product of elementary matrices. Thus, by Problem 8.20,
    |A| = |E_n ··· E₂E₁I| = |E_n| ··· |E₂| |E₁| |I| = |E_n| ··· |E₂| |E₁|
and so |AB| = |E_n ··· E₂E₁B| = |E_n| ··· |E₂| |E₁| |B| = |A||B|
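As a quick editorial check (not part of the text), the multiplicativity |AB| = |A||B| can be confirmed on sample 2-square matrices:

```python
def det2(M):
    # 2x2 determinant: ad - bc
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

def matmul2(A, B):
    # ordinary matrix product of two 2x2 matrices
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]   # |A| = -2
B = [[2, 0], [1, 3]]   # |B| =  6
print(det2(matmul2(A, B)), det2(A) * det2(B))   # -12 -12
```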
8.23. Prove Theorem 8.7: Let A = (a_{ij}); then |A| = a_{i1}A_{i1} + a_{i2}A_{i2} + ··· + a_{in}A_{in}, where
A_{ij} is the cofactor of a_{ij}.
Each term in |A| contains one and only one entry of the ith row (a_{i1}, a_{i2}, ..., a_{in}) of A. Hence
we can write |A| in the form
    |A| = a_{i1}A*_{i1} + a_{i2}A*_{i2} + ··· + a_{in}A*_{in}
(Note that A*_{ij} is a sum of terms involving no entry of the ith row of A.) Thus the theorem is proved if
we can show that
    A*_{ij} = A_{ij} = (−1)^{i+j} |M_{ij}|
where M_{ij} is the matrix obtained by deleting the row and column containing the entry a_{ij}. (Historically,
the expression (−1)^{i+j} |M_{ij}| was defined as the cofactor of a_{ij}, and so the theorem reduces to showing
that the two definitions of the cofactor are equivalent.)
First we consider the case i = n, j = n. Then the sum of terms in |A| containing a_{nn} is
    a_{nn}A*_{nn} = a_{nn} Σ (sgn σ) a_{1σ(1)} a_{2σ(2)} ··· a_{n−1,σ(n−1)}
where we sum over all permutations σ ∈ S_n for which σ(n) = n. However, this is equivalent (Problem
8.64) to summing over all permutations of {1, ..., n−1}. Thus A*_{nn} = |M_{nn}| = (−1)^{n+n} |M_{nn}|.
Now we consider any i and j. We interchange the ith row with each succeeding row until it is
last, and we interchange the jth column with each succeeding column until it is last. Note that the
determinant |M_{ij}| is not affected, since the relative positions of the other rows and columns are not
affected by these interchanges. However, the "sign" of |A| and of A*_{ij} is changed n − i and then
n − j times. Accordingly,
    A*_{ij} = (−1)^{n−i+n−j} |M_{ij}| = (−1)^{i+j} |M_{ij}|
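The cofactor expansion of Theorem 8.7 translates directly into a recursive procedure. The Python sketch below (an editorial addition, not from the text) expands along the first row of a sample 3-square matrix:

```python
def det(A):
    """Laplace expansion along the first row:
    |A| = sum_j (-1)**j * A[0][j] * |minor(0, j)|."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor(0, j): delete row 0 and column j
        minor = [row[:j] + row[j+1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

A = [[1, 3, 0],
     [-2, 2, -1],
     [4, 0, -2]]
print(det(A))   # -28
```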
8.24. Let A = (a_{ij}) and let B be the matrix obtained from A by replacing the ith row of A
by the row vector (b_{i1}, ..., b_{in}). Show that
    |B| = b_{i1}A_{i1} + b_{i2}A_{i2} + ··· + b_{in}A_{in}
Furthermore, show that, for j ≠ i,
    a_{j1}A_{i1} + a_{j2}A_{i2} + ··· + a_{jn}A_{in} = 0
and a_{1j}A_{1i} + a_{2j}A_{2i} + ··· + a_{nj}A_{ni} = 0
Let B = (b_{ij}). By the preceding problem,
    |B| = b_{i1}B_{i1} + b_{i2}B_{i2} + ··· + b_{in}B_{in}
Since B_{ij} does not depend upon the ith row of B, B_{ij} = A_{ij} for j = 1, ..., n. Hence
    |B| = b_{i1}A_{i1} + b_{i2}A_{i2} + ··· + b_{in}A_{in}
Now let A′ be obtained from A by replacing the ith row of A by the jth row of A. Since A′
has two identical rows, |A′| = 0. Thus by the above result,
    |A′| = a_{j1}A_{i1} + a_{j2}A_{i2} + ··· + a_{jn}A_{in} = 0
Using |Aᵗ| = |A|, we also obtain that a_{1j}A_{1i} + a_{2j}A_{2i} + ··· + a_{nj}A_{ni} = 0.
8.25. Prove Theorem 8.8: A·(adj A) = (adj A)·A = |A| I. Thus if |A| ≠ 0, A⁻¹ = (1/|A|)(adj A).
Let A = (a_{ij}) and let A·(adj A) = (b_{ij}). The ith row of A is
    (a_{i1}, a_{i2}, ..., a_{in})   (1)
Since adj A is the transpose of the matrix of cofactors, the jth column of adj A is the transpose of
the cofactors of the jth row of A:
    (A_{j1}, A_{j2}, ..., A_{jn})ᵗ   (2)
Now b_{ij}, the ij-entry in A·(adj A), is obtained by multiplying (1) and (2):
    b_{ij} = a_{i1}A_{j1} + a_{i2}A_{j2} + ··· + a_{in}A_{jn}
Thus by Theorem 8.7 and the preceding problem,
    b_{ij} = |A| if i = j,  and  b_{ij} = 0 if i ≠ j
Accordingly, A·(adj A) is the diagonal matrix with each diagonal element |A|. In other words,
A·(adj A) = |A| I. Similarly, (adj A)·A = |A| I.
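As an editorial illustration (not from the text), the identity A·(adj A) = |A| I can be checked numerically; here adj A is built as the transpose of the matrix of cofactors, exactly as in the proof:

```python
def det(A):
    # recursive Laplace expansion along the first row
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1)**j * A[0][j] *
               det([r[:j] + r[j+1:] for r in A[1:]]) for j in range(n))

def cofactor(A, i, j):
    # (-1)^(i+j) times the minor obtained by deleting row i and column j
    minor = [r[:j] + r[j+1:] for k, r in enumerate(A) if k != i]
    return (-1) ** (i + j) * det(minor)

def adj(A):
    """Classical adjoint: transpose of the matrix of cofactors."""
    n = len(A)
    return [[cofactor(A, j, i) for j in range(n)] for i in range(n)]

A = [[2, 3], [1, 4]]          # |A| = 5
AdjA = adj(A)                 # [[4, -3], [-1, 2]]
prod = [[sum(A[i][k] * AdjA[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
print(prod)                   # [[5, 0], [0, 5]]  =  |A| I
```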
8.26. Prove Theorem 8.9: The system of linear equations Ax = b has a unique solution
if and only if Δ = |A| ≠ 0. In this case the unique solution is given by x₁ = Δ₁/Δ,
x₂ = Δ₂/Δ, ..., x_n = Δ_n/Δ.
By preceding results, Ax = b has a unique solution if and only if A is invertible, and A is
invertible if and only if Δ = |A| ≠ 0.
Now suppose Δ ≠ 0. By Problem 8.25, A⁻¹ = (1/Δ)(adj A). Multiplying Ax = b by A⁻¹, we
obtain
    x = A⁻¹Ax = (1/Δ)(adj A)b   (1)
Note that the ith row of (1/Δ)(adj A) is (1/Δ)(A_{1i}, A_{2i}, ..., A_{ni}). If b = (b₁, b₂, ..., b_n)ᵗ, then by (1),
    x_i = (1/Δ)(b₁A_{1i} + b₂A_{2i} + ··· + b_nA_{ni})
However, as in Problem 8.24,
    b₁A_{1i} + b₂A_{2i} + ··· + b_nA_{ni} = Δ_i,
the determinant of the matrix obtained by replacing the ith column of A by the column vector b.
Thus x_i = (1/Δ)Δ_i, as required.
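Cramer's rule, x_i = Δ_i/Δ, is easy to test numerically. The following Python sketch (an editorial addition) solves the 3-square system of supplementary Problem 8.59(i) with exact rational arithmetic:

```python
from fractions import Fraction

def det3(M):
    # explicit 3x3 determinant
    a, b, c = M[0]; d, e, f = M[1]; g, h, i = M[2]
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def cramer3(A, b):
    """x_i = Delta_i / Delta, where Delta_i replaces column i of A by b."""
    D = det3(A)
    xs = []
    for i in range(3):
        Ai = [row[:i] + [b[k]] + row[i+1:] for k, row in enumerate(A)]
        xs.append(Fraction(det3(Ai), D))
    return xs

# the system of Problem 8.59(i):
#   2x - 5y + 2z = 7,  x + 2y - 4z = 3,  3x - 4y - 6z = 5
A = [[2, -5, 2], [1, 2, -4], [3, -4, -6]]
b = [7, 3, 5]
print(cramer3(A, b))   # x = 5, y = 1, z = 1
```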
8.27. Suppose P is invertible. Show that |P⁻¹| = |P|⁻¹.
P⁻¹P = I. Hence 1 = |I| = |P⁻¹P| = |P⁻¹||P|, and so |P⁻¹| = |P|⁻¹.
8.28. Prove Theorem 8.11: Suppose A and B are similar matrices. Then |A| = |B|.
Since A and B are similar, there exists an invertible matrix P such that B = P⁻¹AP. Then
by the preceding problem, |B| = |P⁻¹AP| = |P⁻¹||A||P| = |A||P⁻¹||P| = |A|.
We remark that although the matrices P⁻¹ and A may not commute, their determinants |P⁻¹|
and |A| do commute since they are scalars in the field K.
8.29. Prove Theorem 8.13: There exists a unique function D, from the n-square matrices
over K into K, such that (i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1. This
function D is the determinant function, i.e. D(A) = |A|.
Let D be the determinant function: D(A) = |A|. We must show that D satisfies (i), (ii) and (iii),
and that D is the only function satisfying (i), (ii) and (iii).
By preceding results, D satisfies (ii) and (iii); hence we need only show that it is multilinear. Suppose
A = (a_{ij}) = (A₁, A₂, ..., A_n), where A_k is the kth row of A. Furthermore, suppose that for a fixed i,
    A_i = B_i + C_i,   where B_i = (b₁, ..., b_n) and C_i = (c₁, ..., c_n)
Accordingly, a_{i1} = b₁ + c₁, a_{i2} = b₂ + c₂, ..., a_{in} = b_n + c_n.
Expanding D(A) = |A| by the ith row,
    D(A) = D(A₁, ..., B_i + C_i, ..., A_n) = a_{i1}A_{i1} + a_{i2}A_{i2} + ··· + a_{in}A_{in}
         = (b₁ + c₁)A_{i1} + (b₂ + c₂)A_{i2} + ··· + (b_n + c_n)A_{in}
         = (b₁A_{i1} + b₂A_{i2} + ··· + b_nA_{in}) + (c₁A_{i1} + c₂A_{i2} + ··· + c_nA_{in})
However, by Problem 8.24, the two sums above are the determinants of the matrices obtained from
A by replacing the ith row by B_i and C_i respectively. That is,
    D(A) = D(A₁, ..., B_i + C_i, ..., A_n)
         = D(A₁, ..., B_i, ..., A_n) + D(A₁, ..., C_i, ..., A_n)
Furthermore, by Theorem 8.3(i),
    D(A₁, ..., kA_i, ..., A_n) = k D(A₁, ..., A_i, ..., A_n)
Thus D is multilinear, i.e. D satisfies (i).
We next must prove the uniqueness of D. Suppose D satisfies (i), (ii) and (iii). If {e₁, ..., e_n}
is the usual basis of Kⁿ, then by (iii), D(e₁, e₂, ..., e_n) = D(I) = 1. Using (ii) we also have (Problem
8.73) that
    D(e_{i_1}, e_{i_2}, ..., e_{i_n}) = sgn σ,   where σ = i_1 i_2 ... i_n   (1)
Now suppose A = (a_{ij}). Observe that the kth row A_k of A is
    A_k = (a_{k1}, a_{k2}, ..., a_{kn}) = a_{k1}e₁ + a_{k2}e₂ + ··· + a_{kn}e_n
Thus D(A) = D(a₁₁e₁ + ··· + a_{1n}e_n, a₂₁e₁ + ··· + a_{2n}e_n, ..., a_{n1}e₁ + ··· + a_{nn}e_n)
Using the multilinearity of D, we can write D(A) as a sum of terms of the form
    D(A) = Σ (a_{1i_1} a_{2i_2} ··· a_{ni_n}) D(e_{i_1}, e_{i_2}, ..., e_{i_n})   (2)
where the sum is over all sequences i_1 i_2 ... i_n with i_k ∈ {1, ..., n}. If two of the indices
are equal, say i_j = i_k but j ≠ k, then by (ii),
    D(e_{i_1}, e_{i_2}, ..., e_{i_n}) = 0
Accordingly, the sum in (2) need only be summed over all permutations σ = i_1 i_2 ... i_n. Using (1),
we finally have that
    D(A) = Σ_σ (a_{1i_1} a_{2i_2} ··· a_{ni_n}) D(e_{i_1}, e_{i_2}, ..., e_{i_n})
         = Σ_σ (sgn σ) a_{1i_1} a_{2i_2} ··· a_{ni_n},   where σ = i_1 i_2 ... i_n
Hence D is the determinant function and so the theorem is proved.
PERMUTATIONS
8.30. Determine the parity of σ = 542163.
Method 1.
We need to obtain the number of pairs (i, j) for which i > j and i precedes j in σ. There are:
    3 numbers (5, 4 and 2) greater than and preceding 1,
    2 numbers (5 and 4) greater than and preceding 2,
    3 numbers (5, 4 and 6) greater than and preceding 3,
    1 number (5) greater than and preceding 4,
    0 numbers greater than and preceding 5,
    0 numbers greater than and preceding 6.
Since 3 + 2 + 3 + 1 + 0 + 0 = 9 is odd, σ is an odd permutation and so sgn σ = −1.
Method 2.
Transpose 1 to the first position as follows:
    542163 to 154263
Transpose 2 to the second position:
    154263 to 125463
Transpose 3 to the third position:
    125463 to 123546
Transpose 4 to the fourth position:
    123546 to 123456
Note that 5 and 6 are in the "correct" positions. Count the number of numbers "jumped":
3 + 2 + 3 + 1 = 9. Since 9 is odd, σ is an odd permutation. (Remark: This method is essentially
the same as the preceding method.)
Method 3.
An interchange of two numbers in a permutation is equivalent to multiplying the permutation
by a transposition. Hence transform σ to the identity permutation using transpositions, such as
    542163 → 142563 → 124563 → 123564 → 123465 → 123456
Since an odd number, 5, of transpositions was used, σ is an odd permutation.
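Method 1 above (counting pairs that are out of order) is exactly the inversion count of the permutation, which is simple to compute. A short Python sketch (editorial, not from the text):

```python
def sgn(perm):
    """Parity by counting pairs (i, j), i before j, with perm[i] > perm[j]
    (Method 1 above): odd count -> -1, even count -> +1."""
    inv = sum(1 for i in range(len(perm))
                for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

print(sgn([5, 4, 2, 1, 6, 3]))   # 9 inversions, so -1
```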
8.31. Let σ = 24513 and τ = 41352 be permutations in S₅. Find (i) the composition permutations
τ∘σ and σ∘τ, (ii) σ⁻¹.
Recall that σ = 24513 and τ = 41352 are short ways of writing
    σ = ( 1 2 3 4 5 )        τ = ( 1 2 3 4 5 )
        ( 2 4 5 1 3 )            ( 4 1 3 5 2 )
which means
    σ(1) = 2, σ(2) = 4, σ(3) = 5, σ(4) = 1 and σ(5) = 3
and
    τ(1) = 4, τ(2) = 1, τ(3) = 3, τ(4) = 5 and τ(5) = 2
(i) Apply σ and then τ:
    (τ∘σ)(1) = τ(2) = 1, (τ∘σ)(2) = τ(4) = 5, (τ∘σ)(3) = τ(5) = 2, (τ∘σ)(4) = τ(1) = 4, (τ∘σ)(5) = τ(3) = 3
and, applying τ and then σ,
    (σ∘τ)(1) = σ(4) = 1, (σ∘τ)(2) = σ(1) = 2, (σ∘τ)(3) = σ(3) = 5, (σ∘τ)(4) = σ(5) = 3, (σ∘τ)(5) = σ(2) = 4
Thus τ∘σ = 15243 and σ∘τ = 12534.
(ii) Interchange the two rows of σ and reorder:
    σ⁻¹ = ( 2 4 5 1 3 ) = ( 1 2 3 4 5 )
          ( 1 2 3 4 5 )   ( 4 1 5 2 3 )
That is, σ⁻¹ = 41523.
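The composition and inversion carried out above can be done mechanically in one-line notation. The following Python sketch (an editorial addition) reproduces the results of this problem:

```python
def compose(t, s):
    """(t o s)(i) = t(s(i)); permutations in one-line notation, 1-based values."""
    return [t[s[i] - 1] for i in range(len(s))]

def inverse(s):
    # if s(i) = v then s^{-1}(v) = i
    inv = [0] * len(s)
    for i, v in enumerate(s, start=1):
        inv[v - 1] = i
    return inv

s = [2, 4, 5, 1, 3]    # sigma = 24513
t = [4, 1, 3, 5, 2]    # tau   = 41352
print(compose(t, s))   # tau o sigma = [1, 5, 2, 4, 3]
print(compose(s, t))   # sigma o tau = [1, 2, 5, 3, 4]
print(inverse(s))      # sigma^{-1}  = [4, 1, 5, 2, 3]
```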
8.32. Consider any permutation σ = j₁j₂ ... j_n. Show that for each pair (i, k) such that
    i > k and i precedes k in σ
there is a pair (i*, k*) such that
    i* < k* and σ(i*) > σ(k*)   (1)
and vice versa. Thus σ is even or odd according as there is an even or
odd number of pairs satisfying (1).
Choose i* and k* so that σ(i*) = i and σ(k*) = k. Then i > k if and only if σ(i*) > σ(k*),
and i precedes k in σ if and only if i* < k*.
8.33. Consider the polynomial g = g(x₁, ..., x_n) = ∏_{i<j} (x_i − x_j). Write out explicitly the
polynomial g = g(x₁, x₂, x₃, x₄).
The symbol ∏ is used for a product of terms in the same way that the symbol Σ is used for a
sum of terms. That is, ∏_{i<j} (x_i − x_j) means the product of all terms (x_i − x_j) for which i < j. Hence
    g = g(x₁, ..., x₄) = (x₁ − x₂)(x₁ − x₃)(x₁ − x₄)(x₂ − x₃)(x₂ − x₄)(x₃ − x₄)
8.34. Let σ be an arbitrary permutation. For the polynomial g in the preceding problem,
define σ(g) = ∏_{i<j} (x_{σ(i)} − x_{σ(j)}). Show that
    σ(g) = g if σ is even,  σ(g) = −g if σ is odd
Accordingly, σ(g) = (sgn σ)g.
Since σ is one-one and onto,
    σ(g) = ∏_{i<j} (x_{σ(i)} − x_{σ(j)}) = ∏ (x_i − x_j)
where the last product runs over all pairs i, j with i ≠ j, each pair occurring once with either i < j or i > j.
Thus σ(g) = g or σ(g) = −g according as there is an even or an odd number of terms
of the form (x_i − x_j) where i > j. Note that for each pair (i, j) for which
    i < j and σ(i) > σ(j)   (1)
there is a term (x_{σ(i)} − x_{σ(j)}) in σ(g) for which σ(i) > σ(j). Since σ is even if and only if there is an
even number of pairs satisfying (1), we have σ(g) = g if and only if σ is even; hence σ(g) = −g
if and only if σ is odd.
8.35. Let σ, τ ∈ S_n. Show that sgn (τ∘σ) = (sgn τ)(sgn σ). Thus the product of two even
or two odd permutations is even, and the product of an odd and an even permutation
is odd.
Using the preceding problem, we have
    sgn (τ∘σ)·g = (τ∘σ)(g) = τ(σ(g)) = τ((sgn σ)g) = (sgn τ)(sgn σ)g
Accordingly, sgn (τ∘σ) = (sgn τ)(sgn σ).
8.36. Consider the permutation σ = j₁j₂ ... j_n. Show that sgn σ⁻¹ = sgn σ and, for scalars
a_{ij}, show that
    a_{j₁1} a_{j₂2} ··· a_{j_n n} = a_{1k₁} a_{2k₂} ··· a_{nk_n},   where σ⁻¹ = k₁k₂ ... k_n
We have σ⁻¹∘σ = ε, the identity permutation. Since ε is even, σ⁻¹ and σ are both even or
both odd. Hence sgn σ⁻¹ = sgn σ.
Since σ = j₁j₂ ... j_n is a permutation, a_{j₁1} a_{j₂2} ··· a_{j_n n} = a_{1k₁} a_{2k₂} ··· a_{nk_n}. Then k₁, k₂, ..., k_n
have the property that
    σ(k₁) = 1, σ(k₂) = 2, ..., σ(k_n) = n
Let τ = k₁k₂ ... k_n. Then for i = 1, ..., n,
    (σ∘τ)(i) = σ(τ(i)) = σ(k_i) = i
Thus σ∘τ = ε, the identity permutation; hence τ = σ⁻¹.
MISCELLANEOUS PROBLEMS
8.37. Find det (T) for each linear operator T:
(i) T is the operator on R³ defined by
    T(x, y, z) = (2x − z, x + 2y − 4z, 3x − 3y + z)
(ii) T is the operator on the vector space V of 2-square matrices over K defined by
T(A) = MA, where M = (a b / c d).
(i) Find the matrix representation of T relative to, say, the usual basis:
    [T] = | 2  0 −1 |
          | 1  2 −4 |
          | 3 −3  1 |
Then
    det (T) = 2(2 − 12) − 0 − 1(−3 − 6) = −20 + 9 = −11
(ii) Find a matrix representation of T in some basis of V, say
    E₁ = (1 0 / 0 0), E₂ = (0 1 / 0 0), E₃ = (0 0 / 1 0), E₄ = (0 0 / 0 1)
Then
    T(E₁) = (a b / c d)(1 0 / 0 0) = (a 0 / c 0) = aE₁ + 0E₂ + cE₃ + 0E₄
    T(E₂) = (a b / c d)(0 1 / 0 0) = (0 a / 0 c) = 0E₁ + aE₂ + 0E₃ + cE₄
    T(E₃) = (a b / c d)(0 0 / 1 0) = (b 0 / d 0) = bE₁ + 0E₂ + dE₃ + 0E₄
    T(E₄) = (a b / c d)(0 0 / 0 1) = (0 b / 0 d) = 0E₁ + bE₂ + 0E₃ + dE₄
Thus
    [T]_E = | a 0 b 0 |
            | 0 a 0 b |
            | c 0 d 0 |
            | 0 c 0 d |
and, expanding the determinant,
    det (T) = a²d² + b²c² − 2abcd = (ad − bc)²
8.38. Find the inverse of A = ( 1 1 1 / 0 1 1 / 0 0 1 ).
The inverse of A is of the form (Problem 8.53): A⁻¹ = ( 1 x y / 0 1 z / 0 0 1 ).
Set AA⁻¹ = I, the identity matrix:
    AA⁻¹ = | 1 1 1 | | 1 x y |   | 1  x+1  y+z+1 |
           | 0 1 1 | | 0 1 z | = | 0   1    z+1  |  =  I
           | 0 0 1 | | 0 0 1 |   | 0   0     1   |
Set corresponding entries equal to each other to obtain the system
    x + 1 = 0,  y + z + 1 = 0,  z + 1 = 0
The solution of the system is x = −1, y = 0, z = −1. Hence
    A⁻¹ = | 1 −1  0 |
          | 0  1 −1 |
          | 0  0  1 |
A⁻¹ could also be found by the formula A⁻¹ = (adj A)/|A|.
8.39. Let D be a 2-linear, alternating function. Show that D(A, B) = −D(B, A). More
generally, show that if D is multilinear and alternating, then
    D(..., A, ..., B, ...) = −D(..., B, ..., A, ...)
that is, the sign is changed whenever two components are interchanged.
Since D is alternating, D(A + B, A + B) = 0. Furthermore, since D is multilinear,
    0 = D(A + B, A + B) = D(A, A + B) + D(B, A + B)
      = D(A, A) + D(A, B) + D(B, A) + D(B, B)
But D(A, A) = 0 and D(B, B) = 0. Hence
    0 = D(A, B) + D(B, A),  or  D(A, B) = −D(B, A)
Similarly,
    0 = D(..., A + B, ..., A + B, ...)
      = D(..., A, ..., A, ...) + D(..., A, ..., B, ...) + D(..., B, ..., A, ...) + D(..., B, ..., B, ...)
      = D(..., A, ..., B, ...) + D(..., B, ..., A, ...)
and thus D(..., A, ..., B, ...) = −D(..., B, ..., A, ...).
8.40. Let V be the vector space of 2 by 2 matrices M = (a b / c d) over R. Determine
whether or not D : V → R is 2-linear (with respect to the rows) if (i) D(M) = a + d,
(ii) D(M) = ad.
(i) No. For example, suppose A = (1, 1) and B = (3, 3). Then
    D(A, B) = D(1 1 / 3 3) = 4  and  D(2A, B) = D(2 2 / 3 3) = 5 ≠ 2D(A, B)
(ii) Yes. Let A = (a₁, a₂), B = (b₁, b₂) and C = (c₁, c₂); then
    D(A, C) = D(a₁ a₂ / c₁ c₂) = a₁c₂  and  D(B, C) = D(b₁ b₂ / c₁ c₂) = b₁c₂
Hence for any scalars s, t ∈ R,
    D(sA + tB, C) = D(sa₁ + tb₁  sa₂ + tb₂ / c₁ c₂) = (sa₁ + tb₁)c₂
                  = s(a₁c₂) + t(b₁c₂) = sD(A, C) + tD(B, C)
That is, D is linear with respect to the first row.
Furthermore,
    D(C, A) = D(c₁ c₂ / a₁ a₂) = c₁a₂  and  D(C, B) = D(c₁ c₂ / b₁ b₂) = c₁b₂
Hence for any scalars s, t ∈ R,
    D(C, sA + tB) = D(c₁ c₂ / sa₁ + tb₁  sa₂ + tb₂) = c₁(sa₂ + tb₂)
                  = s(c₁a₂) + t(c₁b₂) = sD(C, A) + tD(C, B)
That is, D is linear with respect to the second row.
Both linearity conditions imply that D is 2-linear.
Supplementary Problems
COMPUTATION OF DETERMINANTS
8.41. Evaluate the determinant of each matrix: (i) ( 2 5 / 4 1 ), (ii) ( 6 1 / 3 2 ).
8.42. Compute the determinant of each matrix: (i) ( t−2  3 / 4  t−1 ), (ii) ( t−5  7 / −1  t+3 ).
8.43. For each matrix in the preceding problem, find those values of t for which the determinant is zero.
8.44. Compute the determinant of each matrix:
    (i) ( 2 1 1 / 0 5 −2 / 1 −3 4 ),  (ii) ( 3 −2 −4 / 2 5 −1 / 0 6 1 ),  (iii)
8.45. Evaluate the determinant of each matrix:
(i)
/2 1 4
f  1 3 3
(ii) I 3 t + 5 3
6 6 t4,1
8.46. For each matrix in the preceding problem, determine those values of t for which the determinant
is zero.
8.47. Evaluate the determinant of each matrix: (i)
/l 2 2 3
1020
\4 3 2)
COFACTORS, CLASSICAL ADJOINTS, INVERSES
'l 22 3^
8.48. For the following matrix, find the cofactor of:
4 2 1
,1 7 2 3^
(i) the entry 4, (ii) the entry 5, (iii) the entry 7.
8.49. Let A
Find (i) adj A, (ii) A⁻¹.
2 1 3 2\
3 12
31 121' ^"^ I 1 1 4 3
l2 2 1 1/
'1 2 2'
8.50. Let A = f 3 1
a 1 1,
Find (i) adj A, (ii) A⁻¹.
8.51. Find the classical adjoint of each matrix in Problem 8.47.
8.52. Determine the general 2 by 2 matrix A for which A = adj A.
8.53. Suppose A is diagonal and B is triangular; say
    A = diag(a₁, a₂, ..., a_n)  and  B triangular with diagonal elements b₁, b₂, ..., b_n
(i) Show that adj A is diagonal and adj B is triangular.
(ii) Show that B is invertible iff all b_i ≠ 0; hence A is invertible iff all a_i ≠ 0.
(iii) Show that the inverses of A and B (if either exists) are of the form
    A⁻¹ = diag(a₁⁻¹, a₂⁻¹, ..., a_n⁻¹),  B⁻¹ triangular with diagonal elements b₁⁻¹, b₂⁻¹, ..., b_n⁻¹
That is, the diagonal elements of A⁻¹ and B⁻¹ are the inverses of the corresponding diagonal
elements of A and B.
DETERMINANT OF A LINEAR OPERATOR
8.54. Let T be the linear operator on R³ defined by
    T(x, y, z) = (3x − 2z, 5y + 7z, x + y + z)
Find det (T).
8.55. Let D : V → V be the differential operator, i.e. D(v) = dv/dt. Find det (D) if V is the space generated
by (i) {1, t, ..., tⁿ}, (ii) {e^t, e^{2t}, e^{3t}}, (iii) {sin t, cos t}.
8.56. Prove Theorem 8.12: Let T and S be linear operators on V. Then:
(i) det (S∘T) = det (S) · det (T); (ii) T is invertible if and only if det (T) ≠ 0.
8.57. Show that: (i) det (1_V) = 1 where 1_V is the identity operator; (ii) det (T⁻¹) = det (T)⁻¹ if T is
invertible.
DETERMINANTS AND LINEAR EQUATIONS
8.58. Solve by determinants: (i) 3x + 5y = 8, 4x − 2y = 1;  (ii) 2x − 3y = −1, 4x + 7y = −1.
8.59. Solve by determinants: (i) 2x − 5y + 2z = 7, x + 2y − 4z = 3, 3x − 4y − 6z = 5;
(ii) 2z + 3 = y + 3x, x − 3z = 2y + 1, 3y + z = 2 − 2x.
8.60. Prove Theorem 8.10: The homogeneous system Ax = 0 has a nonzero solution if and only if
Δ = |A| = 0.
PERMUTATIONS
8.61. Determine the parity of these permutations in S₅: (i) σ = 32154, (ii) τ = 13524, (iii) π = 42531.
8.62. For the permutations σ, τ and π in Problem 8.61, find (i) τ∘σ, (ii) π∘σ, (iii) σ⁻¹, (iv) τ⁻¹.
8.63. Let τ ∈ S_n. Show that τ∘σ runs through S_n as σ runs through S_n; that is, S_n = {τ∘σ : σ ∈ S_n}.
8.64. Let σ ∈ S_n have the property that σ(n) = n. Let σ* ∈ S_{n−1} be defined by σ*(x) = σ(x). (i) Show
that sgn σ* = sgn σ. (ii) Show that as σ runs through S_n, where σ(n) = n, σ* runs through
S_{n−1}; that is, S_{n−1} = {σ* : σ ∈ S_n, σ(n) = n}.
MULTILINEARITY
8.65. Let V = (Kᵐ)ᵐ, i.e. V is the space of m-square matrices viewed as m-tuples of row vectors. Let
D : V → K.
(i) Show that the following weaker statement is equivalent to D being alternating:
    D(A₁, A₂, ..., A_m) = 0
whenever A_i = A_{i+1} for some i.
(ii) Suppose D is m-linear and alternating. Show that if A₁, A₂, ..., A_m are linearly dependent,
then D(A₁, ..., A_m) = 0.
8.66. Let V be the space of 2 by 2 matrices M = (a b / c d) over R. Determine whether or not
D : V → R is 2-linear (with respect to the rows) if (i) D(M) = ac − bd, (ii) D(M) = ab − cd,
(iii) D(M) = 0, (iv) D(M) = 1.
8.67. Let V be the space of n-square matrices over K. Suppose B ∈ V is invertible and so det (B) ≠ 0.
Define D : V → K by D(A) = det (AB)/det (B) where A ∈ V. Hence
    D(A₁, A₂, ..., A_n) = det (A₁B, A₂B, ..., A_nB)/det (B)
where A_i is the ith row of A and so A_iB is the ith row of AB. Show that D is multilinear and
alternating, and that D(I) = 1. (Thus by Theorem 8.13, D(A) = det (A) and so det (AB) =
det (A) det (B). This method is used by some texts to prove Theorem 8.5, i.e. |AB| = |A||B|.)
MISCELLANEOUS PROBLEMS
8.68. Let A be an n-square matrix. Prove |kA| = kⁿ|A|.
8.69. Prove:
    | 1  x₁  x₁²  ...  x₁ⁿ⁻¹ |
    | 1  x₂  x₂²  ...  x₂ⁿ⁻¹ |
    | ......................... |  =  ∏_{i<j} (x_j − x_i)
    | 1  x_n  x_n²  ...  x_nⁿ⁻¹ |
The above is called the Vandermonde determinant of order n.
8.70. Consider the block matrix M = ( A B / 0 C ) where A and C are square matrices. Prove |M| = |A||C|.
More generally, prove that if M is a triangular block matrix with square matrices A₁, ..., A_m on
the diagonal, then |M| = |A₁| |A₂| ··· |A_m|.
8.71. Let A, B, C and D be commuting n-square matrices. Consider the 2n-square block matrix
M = ( A B / C D ). Prove that |M| = |AD − BC|.
8.72. Suppose A is orthogonal, that is, AᵗA = I. Show that |A| = ±1.
8.73. Consider a permutation σ = j₁j₂ ... j_n. Let {e_i} be the usual basis of Kⁿ, and let A be the matrix
whose ith row is e_{j_i}, i.e. A = (e_{j₁}, e_{j₂}, ..., e_{j_n}). Show that |A| = sgn σ.
8.74. Let A be an n-square matrix. The determinantal rank of A is the order of the largest square submatrix
of A (obtained by deleting rows and columns of A) whose determinant is not zero. Show that the
determinantal rank of A is equal to its rank, i.e. the maximum number of linearly independent rows
(or columns).
Answers to Supplementary Problems
8.41. (i) −18, (ii) 15.
8.42. (i) t² − 3t − 10, (ii) t² − 2t − 8.
8.43. (i) t = 5, t = −2; (ii) t = 4, t = −2.
8.44. (i) 21, (ii) −11, (iii) 100, (iv) 0.
8.45. (i) (t + 2)(t − 3)(t − 4), (ii) (t + 2)²(t − 4), (iii) (t + 2)²(t − 4).
8.46. (i) 3, 4, −2; (ii) 4, −2; (iii) 4, −2.
8.47. (i) 131, (ii) 55.
8.48. (i) 135, (ii) 103, (iii) 31.
8.49. adj A = 1 1 1 , Ai = (adj A)/A1 = i ^ 
\ 2 2 0/ \l 1 0,
1 2\ /I 2\
8.50. adj A =  3 1 6 , Ai =31
2 1 5/ \2 1 5/
(16 29 26 2\ / 21 14 17 19^
30 38 16 29 1 ,... I 44 H 33 11
8 51 13 1 (") 29 1 13 21
13 1 28 18/ \ 17 7 19 18;
8.52. A = ( k 0 / 0 k )
8.54. det (T) = 4.
8.55. (i) 0, (ii) 6, (iii) 1.
8.58. (i) x = 21/26, y = 29/26; (ii) x = −5/13, y = 1/13.
8.59. (i) x = 5, y = 1, z = 1. (ii) Since Δ = 0, the system cannot be solved by determinants.
8.61. sgn σ = 1, sgn τ = −1, sgn π = −1.
8.62. (i) τ∘σ = 53142, (ii) π∘σ = 52413, (iii) σ⁻¹ = 32154, (iv) τ⁻¹ = 14253.
8.66. (i) Yes, (ii) No, (iii) Yes, (iv) No.
chapter 9
Eigenvalues and Eigenvectors
INTRODUCTION
In this chapter we investigate the theory of a single linear operator T on a vector
space V of finite dimension. In particular, we find conditions under which T is diago
nalizable. As was seen in Chapter 7, this question is closely related to the theory of
similarity transformations for matrices.
We shall also associate certain polynomials with an operator T: its characteristic
polynomial and its minimum polynomial. These polynomials and their roots play a major
role in the investigation of T. We comment that the particular field K also plays an im
portant part in the theory since the existence of roots of a polynomial depends on K.
POLYNOMIALS OF MATRICES AND LINEAR OPERATORS
Consider a polynomial f(t) over a field K: f(t) = a_n tⁿ + ··· + a₁t + a₀. If A is a square
matrix over K, then we define
    f(A) = a_n Aⁿ + ··· + a₁A + a₀I
where I is the identity matrix. In particular, we say that A is a root or zero of the polynomial
f(t) if f(A) = 0.
Example 9.1: Let A = ( 1 2 / 3 4 ), and let f(t) = 2t² − 3t + 7, g(t) = t² − 5t − 2. Then
    f(A) = 2( 1 2 / 3 4 )² − 3( 1 2 / 3 4 ) + 7( 1 0 / 0 1 ) = ( 18 14 / 21 39 )
and
    g(A) = ( 1 2 / 3 4 )² − 5( 1 2 / 3 4 ) − 2( 1 0 / 0 1 ) = ( 0 0 / 0 0 )
Thus A is a zero of g(t).
The following theorem applies.
Theorem 9.1: Let f and g be polynomials over K, and let A be an n-square matrix over K.
Then
    (i) (f + g)(A) = f(A) + g(A)
    (ii) (fg)(A) = f(A) g(A)
and, for any scalar k ∈ K,
    (iii) (kf)(A) = k f(A)
Furthermore, since f(t) g(t) = g(t) f(t) for any polynomials f(t) and g(t),
    f(A) g(A) = g(A) f(A)
That is, any two polynomials in the matrix A commute.
Now suppose T : V → V is a linear operator on a vector space V over K. If f(t) =
a_n tⁿ + ··· + a₁t + a₀, then we define f(T) in the same way as we did for matrices:
    f(T) = a_n Tⁿ + ··· + a₁T + a₀I
where I is now the identity mapping. We also say that T is a zero or root of f(t) if f(T) = 0.
We remark that the relations in Theorem 9.1 hold for operators as they do for matrices;
hence any two polynomials in T commute.
Furthermore, if A is a matrix representation of T, then f(A) is the matrix representation
of f(T). In particular, f(T) = 0 if and only if f(A) = 0.
EIGENVALUES AND EIGENVECTORS
Let T : V → V be a linear operator on a vector space V over a field K. A scalar λ ∈ K
is called an eigenvalue of T if there exists a nonzero vector v ∈ V for which
    T(v) = λv
Every vector satisfying this relation is then called an eigenvector of T belonging to the
eigenvalue λ. Note that each scalar multiple kv is such an eigenvector:
    T(kv) = kT(v) = k(λv) = λ(kv)
The set of all such vectors is a subspace of V (Problem 9.6) called the eigenspace of λ.
The terms characteristic value and characteristic vector (or: proper value and proper
vector) are frequently used instead of eigenvalue and eigenvector.
Example 9.2: Let I : V → V be the identity mapping. Then, for every v ∈ V, I(v) = v = 1v.
Hence 1 is an eigenvalue of I, and every vector in V is an eigenvector belonging to 1.
Example 9.3: Let T : R² → R² be the linear operator which rotates each vector v ∈ R² by an
angle θ = 90°. Note that no nonzero vector is a multiple of itself. Hence T has
no eigenvalues and so no eigenvectors.
Example 9.4: Let D be the differential operator on the vector space V of differentiable functions.
We have D(e^{5t}) = 5e^{5t}. Hence 5 is an eigenvalue of D with eigenvector e^{5t}.
If A is an n-square matrix over K, then an eigenvalue of A means an eigenvalue of A
viewed as an operator on Kⁿ. That is, λ ∈ K is an eigenvalue of A if, for some nonzero
(column) vector v ∈ Kⁿ,
    Av = λv
In this case v is an eigenvector of A belonging to λ.
Example 9.5: Find eigenvalues and associated nonzero eigenvectors of the matrix A = ( 1 2 / 3 2 ).
We seek a scalar t and a nonzero vector X = ( x / y ) such that AX = tX:
    ( 1 2 / 3 2 )( x / y ) = t ( x / y )
The above matrix equation is equivalent to the homogeneous system
    x + 2y = tx          (t − 1)x − 2y = 0
    3x + 2y = ty ,  or   −3x + (t − 2)y = 0     (1)
Recall that the homogeneous system has a nonzero solution if and only if the determinant
of the matrix of coefficients is 0:
    | t − 1   −2  |
    |  −3   t − 2 |  =  t² − 3t − 4  =  (t − 4)(t + 1)  =  0
Thus t is an eigenvalue of A if and only if t = 4 or t = −1.
Setting t = 4 in (1),
    3x − 2y = 0 and −3x + 2y = 0,  or simply 3x − 2y = 0
Thus v = ( x / y ) = ( 2 / 3 ) is a nonzero eigenvector belonging to the eigenvalue t = 4,
and every other eigenvector belonging to t = 4 is a multiple of v.
Setting t = −1 in (1),
    −2x − 2y = 0 and −3x − 3y = 0,  or simply x + y = 0
Thus w = ( x / y ) = ( 1 / −1 ) is a nonzero eigenvector belonging to the eigenvalue
t = −1, and every other eigenvector belonging to t = −1 is a multiple of w.
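As an editorial aside (not in the original text), the computation of Example 9.5 can be replayed numerically: for a 2-square matrix the eigenvalues are the roots of t² − (tr A)t + |A| = 0, and each eigenvector is verified by checking Av = λv:

```python
A = [[1, 2], [3, 2]]
tr = A[0][0] + A[1][1]                       # trace = 3
d = A[0][0]*A[1][1] - A[0][1]*A[1][0]        # |A| = -4
disc = (tr*tr - 4*d) ** 0.5                  # sqrt(25) = 5.0
roots = [(tr + disc) / 2, (tr - disc) / 2]
print(roots)                                 # [4.0, -1.0]

def is_eigen(A, v, lam):
    # check the defining relation Av = lam * v
    Av = [A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1]]
    return Av == [lam*v[0], lam*v[1]]

print(is_eigen(A, [2, 3], 4))     # True:  v = (2, 3) belongs to t = 4
print(is_eigen(A, [1, -1], -1))   # True:  w = (1, -1) belongs to t = -1
```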
The next theorem gives an important characterization of eigenvalues which is frequently
used as its definition.
Theorem 9.2: Let T : V → V be a linear operator on a vector space over K. Then λ ∈ K
is an eigenvalue of T if and only if the operator λI − T is singular. The
eigenspace of λ is then the kernel of λI − T.
Proof. λ is an eigenvalue of T if and only if there exists a nonzero vector v such that
    T(v) = λv  or  (λI)(v) − T(v) = 0  or  (λI − T)(v) = 0
i.e. λI − T is singular. We also have that v is in the eigenspace of λ if and only if the above
relations hold; hence v is in the kernel of λI − T.
We now state a very useful theorem which we prove (Problem 9.14) by induction:
Theorem 9.3: Nonzero eigenvectors belonging to distinct eigenvalues are linearly
independent.
Example 9.6: Consider the functions e^{a₁t}, e^{a₂t}, ..., e^{a_nt}, where a₁, ..., a_n are distinct real numbers.
If D is the differential operator, then D(e^{a_it}) = a_i e^{a_it}. Accordingly, e^{a₁t}, ..., e^{a_nt}
are eigenvectors of D belonging to the distinct eigenvalues a₁, ..., a_n, and so, by
Theorem 9.3, are linearly independent.
We remark that independent eigenvectors can belong to the same eigenvalue (see
Problem 9.7).
DIAGONALIZATION AND EIGENVECTORS
Let T : V → V be a linear operator on a vector space V with finite dimension n. Note
that T can be represented by a diagonal matrix
    | k₁  0  ...  0  |
    | 0   k₂ ...  0  |
    | .............. |
    | 0   0  ...  k_n |
if and only if there exists a basis {v₁, ..., v_n} of V for which
    T(v₁) = k₁v₁
    T(v₂) = k₂v₂
    ..............
    T(v_n) = k_nv_n
that is, such that the vectors v₁, ..., v_n are eigenvectors of T belonging respectively to eigenvalues
k₁, ..., k_n. In other words:
Theorem 9.4: A linear operator T : V → V can be represented by a diagonal matrix B
if and only if V has a basis consisting of eigenvectors of T. In this case
the diagonal elements of B are the corresponding eigenvalues.
We have the following equivalent statement.
Alternate Form of Theorem 9.4: An n-square matrix A is similar to a diagonal matrix
B if and only if A has n linearly independent eigenvectors.
In this case the diagonal elements of B are the
corresponding eigenvalues.
In the above theorem, if we let P be the matrix whose columns are the n independent
eigenvectors of A, then B = P⁻¹AP.
Example 9.7: Consider the matrix A = ( 1 2 / 3 2 ). By Example 9.5, A has two independent
eigenvectors ( 2 / 3 ) and ( 1 / −1 ). Set
    P = ( 2 1 / 3 −1 ),  and so  P⁻¹ = ( 1/5 1/5 / 3/5 −2/5 )
Then A is similar to the diagonal matrix
    B = P⁻¹AP = ( 1/5 1/5 / 3/5 −2/5 )( 1 2 / 3 2 )( 2 1 / 3 −1 ) = ( 4 0 / 0 −1 )
As expected, the diagonal elements 4 and −1 of the diagonal matrix B are the eigenvalues
corresponding to the given eigenvectors.
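The diagonalization B = P⁻¹AP of Example 9.7 can be verified directly. The Python sketch below (editorial, not from the text) uses exact rational arithmetic for P⁻¹:

```python
from fractions import Fraction

def matmul(X, Y):
    # square matrix product
    n = len(X)
    return [[sum(X[i][k]*Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [3, 2]]
P = [[2, 1], [3, -1]]                    # columns are the eigenvectors
dP = P[0][0]*P[1][1] - P[0][1]*P[1][0]   # |P| = -5
# 2x2 inverse via the adjoint formula
Pinv = [[Fraction(P[1][1], dP), Fraction(-P[0][1], dP)],
        [Fraction(-P[1][0], dP), Fraction(P[0][0], dP)]]
B = matmul(matmul(Pinv, A), P)
print(B)    # the diagonal matrix diag(4, -1), as Fractions
```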
CHARACTERISTIC POLYNOMIAL, CAYLEY–HAMILTON THEOREM
Consider an n-square matrix A over a field K:
    A = | a₁₁  a₁₂  ...  a₁ₙ |
        | a₂₁  a₂₂  ...  a₂ₙ |
        | .................. |
        | a_{n1}  a_{n2}  ...  a_{nn} |
The matrix tI_n − A, where I_n is the n-square identity matrix and t is an indeterminate, is
called the characteristic matrix of A:
    tI_n − A = | t − a₁₁   −a₁₂   ...   −a₁ₙ  |
               |  −a₂₁   t − a₂₂  ...   −a₂ₙ  |
               | ............................ |
               |  −a_{n1}   −a_{n2}  ...  t − a_{nn} |
Its determinant
    Δ_A(t) = det (tI_n − A)
which is a polynomial in t, is called the characteristic polynomial of A. We also call
    Δ_A(t) = det (tI_n − A) = 0
the characteristic equation of A.
Now each term in the determinant contains one and only one entry from each row and
from each column; hence the above characteristic polynomial is of the form
    Δ_A(t) = (t − a₁₁)(t − a₂₂) ··· (t − a_{nn})
             + terms with at most n − 2 factors of the form t − a_{ii}
Accordingly,
    Δ_A(t) = tⁿ − (a₁₁ + a₂₂ + ··· + a_{nn})tⁿ⁻¹ + terms of lower degree
Recall that the trace of A is the sum of its diagonal elements. Thus the characteristic
polynomial Δ_A(t) = det (tI_n − A) of A is a monic polynomial of degree n, and the coefficient
of tⁿ⁻¹ is the negative of the trace of A. (A polynomial is monic if its leading coefficient is 1.)
Furthermore, if we set t = 0 in Δ_A(t), we obtain
    Δ_A(0) = |−A| = (−1)ⁿ|A|
But Δ_A(0) is the constant term of the polynomial Δ_A(t). Thus the constant term of the characteristic
polynomial of the matrix A is (−1)ⁿ|A| where n is the order of A.
Example 9.8: The characteristic polynomial of the matrix A = ( 1 3 0 / −2 2 −1 / 4 0 −2 ) is
    Δ(t) = |tI − A| = | t − 1   −3     0   |
                      |   2   t − 2    1   |
                      |  −4     0   t + 2  |  =  t³ − t² + 2t + 28
As expected, Δ(t) is a monic polynomial of degree 3.
We now state one of the most important theorems in linear algebra.
Cayley-Hamilton Theorem 9.5: Every matrix is a zero of its characteristic polynomial.

Example 9.9: The characteristic polynomial of the matrix  A = ( 1  2 )  is
                                                              ( 3  2 )

           Δ(t)  =  |tI - A|  =  | t-1  -2  |  =  t^2 - 3t - 4
                                 | -3   t-2 |

           As expected from the Cayley-Hamilton theorem, A is a zero of Δ(t):

           Δ(A) = A^2 - 3A - 4I = ( 7   6 ) - ( 3  6 ) - ( 4  0 )  =  ( 0  0 )
                                  ( 9  10 )   ( 9  6 )   ( 0  4 )     ( 0  0 )
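The computation of Example 9.9 is easy to verify mechanically; the following sketch evaluates Δ(A) = A² − 3A − 4I entrywise in plain Python.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_scale(c, X):
    return [[c * x for x in row] for row in X]

# A and its characteristic polynomial t^2 - 3t - 4, as in Example 9.9
A = [[1, 2], [3, 2]]
I = [[1, 0], [0, 1]]
delta_A = mat_add(mat_mul(A, A), mat_add(mat_scale(-3, A), mat_scale(-4, I)))
print(delta_A)   # the zero matrix [[0, 0], [0, 0]]
```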
The next theorem shows the intimate relationship between characteristic polynomials
and eigenvalues.

Theorem 9.6: Let A be an n-square matrix over a field K. A scalar λ ∈ K is an eigen-
value of A if and only if λ is a root of the characteristic polynomial Δ(t) of A.

Proof. By Theorem 9.2, λ is an eigenvalue of A if and only if λI - A is singular.
Furthermore, by Theorem 8.4, λI - A is singular if and only if |λI - A| = 0, i.e. λ is a root
of Δ(t). Thus the theorem is proved.
Using Theorems 9.3, 9.4 and 9.6, we obtain

Corollary 9.7: If the characteristic polynomial Δ(t) of an n-square matrix A is a product
of distinct linear factors:

   Δ(t) = (t - a1)(t - a2) ... (t - an)

i.e. if a1, ..., an are distinct roots of Δ(t), then A is similar to a diagonal
matrix whose diagonal elements are the ai.
Furthermore, using the Fundamental Theorem of Algebra (every polynomial over C
has a root) and the above theorem, we obtain

Corollary 9.8: Let A be an n-square matrix over the complex field C. Then A has at
least one eigenvalue.

Example 9.10: Let A = ( 3  0   0 ). Its characteristic polynomial is
                      ( 0  2  -5 )
                      ( 0  1  -2 )

                      | t-3    0    0  |
              Δ(t) =  |  0   t-2    5  |  =  (t - 3)(t^2 + 1)
                      |  0    -1   t+2 |

           We consider two cases:

           (i)  A is a matrix over the real field R. Then A has only the one eigenvalue 3.
                Since 3 has only one independent eigenvector, A is not diagonalizable.

           (ii) A is a matrix over the complex field C. Then A has three distinct eigenvalues:
                3, i and -i. Thus there exists an invertible matrix P over the complex field C
                for which

                   P⁻¹AP  =  ( 3  0   0 )
                             ( 0  i   0 )
                             ( 0  0  -i )

                i.e. A is diagonalizable.
Now suppose A and B are similar matrices, say B = P⁻¹AP where P is invertible. We
show that A and B have the same characteristic polynomial. Using tI = P⁻¹(tI)P,

   |tI - B| = |tI - P⁻¹AP| = |P⁻¹(tI)P - P⁻¹AP|
            = |P⁻¹(tI - A)P| = |P⁻¹| |tI - A| |P|

Since determinants are scalars and commute, and since |P⁻¹| |P| = 1, we finally obtain

   |tI - B| = |tI - A|

Thus we have proved
Theorem 9.9: Similar matrices have the same characteristic polynomial.
MINIMUM POLYNOMIAL
Let A be an n-square matrix over a field K. Observe that there are nonzero polynomials
f(t) for which f(A) = 0; for example, the characteristic polynomial of A. Among these
polynomials we consider those of lowest degree and from them we select one whose leading
coefficient is 1, i.e. which is monic. Such a polynomial m(t) exists and is unique (Problem
9.25); we call it the minimum polynomial of A.

Theorem 9.10: The minimum polynomial m(t) of A divides every polynomial which has A
as a zero. In particular, m(t) divides the characteristic polynomial Δ(t) of A.

There is an even stronger relationship between m(t) and Δ(t).
Theorem 9.11: The characteristic and minimum polynomials of a matrix A have the same
irreducible factors.

This theorem does not say that m(t) = Δ(t); only that any irreducible factor of one must
divide the other. In particular, since a linear factor is irreducible, m(t) and Δ(t) have the
same linear factors; hence they have the same roots. Thus from Theorem 9.6 we obtain

Theorem 9.12: A scalar λ is an eigenvalue of a matrix A if and only if λ is a root of the
minimum polynomial of A.
Example 9.11: Find the minimum polynomial m(t) of the matrix  A = ( 2  1  0  0 )
                                                                  ( 0  2  0  0 )
                                                                  ( 0  0  2  0 )
                                                                  ( 0  0  0  5 )

           The characteristic polynomial of A is Δ(t) = |tI - A| = (t - 2)^3 (t - 5). By
           Theorem 9.11, both t - 2 and t - 5 must be factors of m(t). But by Theorem 9.10,
           m(t) must divide Δ(t); hence m(t) must be one of the following three polynomials:

              m1(t) = (t-2)(t-5),    m2(t) = (t-2)^2 (t-5),    m3(t) = (t-2)^3 (t-5)

           We know from the Cayley-Hamilton theorem that m3(A) = Δ(A) = 0. The reader can
           verify that m1(A) ≠ 0 but m2(A) = 0. Accordingly, m2(t) = (t - 2)^2 (t - 5) is the
           minimum polynomial of A.
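The verification left to the reader can be sketched in Python; the entries of A are taken as in the example (a 2x2 Jordan block for 2, plus diagonal entries 2 and 5, as reconstructed here).

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def shifted(A, c):
    """Return A - c*I."""
    n = len(A)
    return [[A[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]

def is_zero(X):
    return all(x == 0 for row in X for x in row)

A = [[2, 1, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 5]]

m1 = mat_mul(shifted(A, 2), shifted(A, 5))   # (A - 2I)(A - 5I)
m2 = mat_mul(shifted(A, 2), m1)              # (A - 2I)^2 (A - 5I)
print(is_zero(m1), is_zero(m2))   # m1 is nonzero, m2 is zero
```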
Example 9.12: Let A be a 3 by 3 matrix over the real field R. We show that A cannot be a zero
           of the polynomial f(t) = t^2 + 1. By the Cayley-Hamilton theorem, A is a zero of
           its characteristic polynomial Δ(t). Note that Δ(t) is of degree 3; hence it has at least
           one real root.

           Now suppose A is a zero of f(t). Since f(t) is irreducible over R, f(t) must be
           the minimal polynomial of A. But f(t) has no real root. This contradicts the fact
           that the characteristic and minimal polynomials have the same roots. Thus A is not
           a zero of f(t).

           The reader can verify that the following 3 by 3 matrix over the complex
           field C is a zero of f(t):

              (  0  1  0 )
              ( -1  0  0 )
              (  0  0  i )
CHARACTERISTIC AND MINIMUM POLYNOMIALS OF LINEAR OPERATORS
Now suppose T : V -> V is a linear operator on a vector space V with finite dimension.
We define the characteristic polynomial Δ(t) of T to be the characteristic polynomial of any
matrix representation of T. By Theorem 9.9, Δ(t) is independent of the particular basis in
which the matrix representation is computed. Note that the degree of Δ(t) is equal to the
dimension of V. We have theorems for T which are similar to the ones we had for matrices:

Theorem 9.5': T is a zero of its characteristic polynomial.

Theorem 9.6': The scalar λ ∈ K is an eigenvalue of T if and only if λ is a root of the
characteristic polynomial of T.

The algebraic multiplicity of an eigenvalue λ ∈ K of T is defined to be the multiplicity
of λ as a root of the characteristic polynomial of T. The geometric multiplicity of the
eigenvalue λ is defined to be the dimension of its eigenspace.

Theorem 9.13: The geometric multiplicity of an eigenvalue λ does not exceed its algebraic
multiplicity.
Example 9.13: Let V be the vector space of functions which has {sin θ, cos θ} as a basis, and let D
           be the differential operator on V. Then

              D(sin θ) =  cos θ =  0(sin θ) + 1(cos θ)
              D(cos θ) = -sin θ = -1(sin θ) + 0(cos θ)

           The matrix A of D in the above basis is therefore  A = [D] = ( 0  -1 ).  Thus
                                                                        ( 1   0 )

              det (tI - A)  =  |  t  1 |  =  t^2 + 1
                               | -1  t |

           and the characteristic polynomial of D is Δ(t) = t^2 + 1.
On the other hand, the minimum polynomial m(t) of the operator T is defined independently
of the theory of matrices, as the polynomial of lowest degree and leading coefficient 1
which has T as a zero. However, for any polynomial f(t),

   f(T) = 0   if and only if   f(A) = 0

where A is any matrix representation of T. Accordingly, T and A have the same minimum
polynomial. We remark that all the theorems in this chapter on the minimum polynomial
of a matrix also hold for the minimum polynomial of the operator T.
Solved Problems
POLYNOMIALS OF MATRICES AND LINEAR OPERATORS
9.1. Find f{A) where A = T ~z\ and f{t) ^ fM + 1.
9.2. Show that A = ( 4  5 ) is a root of f(t) = t^2 - 4t - 5.
                   ( 1  0 )

   f(A) = A^2 - 4A - 5I = ( 21  20 ) - ( 16  20 ) - ( 5  0 )  =  ( 0  0 )
                          (  4   5 )   (  4   0 )   ( 0  5 )     ( 0  0 )
9.3. Let V be the vector space of functions which has {sin θ, cos θ} as a basis, and let
D be the differential operator on V. Show that D is a zero of f(t) = t^2 + 1.

   Apply f(D) to each basis vector:

      f(D)(sin θ) = (D^2 + I)(sin θ) = D^2(sin θ) + I(sin θ) = -sin θ + sin θ = 0
      f(D)(cos θ) = (D^2 + I)(cos θ) = D^2(cos θ) + I(cos θ) = -cos θ + cos θ = 0

   Since each basis vector is mapped into 0, every vector v ∈ V is also mapped into 0 by f(D). Thus
   f(D) = 0.

   This result is expected since, by Example 9.13, f(t) is the characteristic polynomial of D.
9.4. Let A be a matrix representation of an operator T. Show that f(A) is the matrix
representation of f(T), for any polynomial f(t).

   Let φ be the mapping T ↦ A, i.e. which sends the operator T into its matrix representation A.
   We need to prove that φ(f(T)) = f(A). Suppose f(t) = a_n t^n + ... + a_1 t + a_0. The proof is by
   induction on n, the degree of f(t).

   Suppose n = 0. Recall that φ(I') = I where I' is the identity mapping and I is the identity
   matrix. Thus

      φ(f(T)) = φ(a_0 I') = a_0 φ(I') = a_0 I = f(A)

   and so the theorem holds for n = 0.

   Now assume the theorem holds for polynomials of degree less than n. Then since φ is an
   algebra isomorphism,

      φ(f(T)) = φ(a_n T^n + a_{n-1} T^{n-1} + ... + a_1 T + a_0 I')
              = a_n φ(T) φ(T^{n-1}) + φ(a_{n-1} T^{n-1} + ... + a_1 T + a_0 I')
              = a_n A A^{n-1} + (a_{n-1} A^{n-1} + ... + a_1 A + a_0 I)  =  f(A)

   and the theorem is proved.
9.5. Prove Theorem 9.1: Let f and g be polynomials over K. Let A be a square matrix
over K. Then: (i) (f + g)(A) = f(A) + g(A); (ii) (fg)(A) = f(A) g(A); and (iii) (kf)(A) =
k f(A) where k ∈ K.

   Suppose f = a_n t^n + ... + a_1 t + a_0 and g = b_m t^m + ... + b_1 t + b_0. Then by definition,

      f(A) = a_n A^n + ... + a_1 A + a_0 I   and   g(A) = b_m A^m + ... + b_1 A + b_0 I

   (i) Suppose m ≤ n and let b_i = 0 if i > m. Then

      f + g = (a_n + b_n) t^n + ... + (a_1 + b_1) t + (a_0 + b_0)

   Hence

      (f + g)(A) = (a_n + b_n) A^n + ... + (a_1 + b_1) A + (a_0 + b_0) I
                 = a_n A^n + b_n A^n + ... + a_1 A + b_1 A + a_0 I + b_0 I = f(A) + g(A)

   (ii) By definition, fg = c_{n+m} t^{n+m} + ... + c_1 t + c_0 = Σ_{k=0}^{n+m} c_k t^k where
   c_k = a_0 b_k + a_1 b_{k-1} + ... + a_k b_0 = Σ_{i=0}^{k} a_i b_{k-i}. Hence (fg)(A) = Σ_{k=0}^{n+m} c_k A^k and

      f(A) g(A) = (Σ_{i=0}^{n} a_i A^i)(Σ_{j=0}^{m} b_j A^j) = Σ_{i=0}^{n} Σ_{j=0}^{m} a_i b_j A^{i+j} = Σ_{k=0}^{n+m} c_k A^k = (fg)(A)

   (iii) By definition, kf = k a_n t^n + ... + k a_1 t + k a_0, and so

      (kf)(A) = k a_n A^n + ... + k a_1 A + k a_0 I = k(a_n A^n + ... + a_1 A + a_0 I) = k f(A)
EIGENVALUES AND EIGENVECTORS
9.6. Let λ be an eigenvalue of an operator T : V -> V. Let V_λ denote the set of all eigen-
vectors of T belonging to the eigenvalue λ (called the eigenspace of λ). Show that
V_λ is a subspace of V.

   Suppose v, w ∈ V_λ, that is, T(v) = λv and T(w) = λw. Then for any scalars a, b ∈ K,

      T(av + bw) = a T(v) + b T(w) = a(λv) + b(λw) = λ(av + bw)

   Thus av + bw is an eigenvector belonging to λ, i.e. av + bw ∈ V_λ. Hence V_λ is a subspace of V.
9.7. Let A = ( 1  4 ). (i) Find all eigenvalues of A and the corresponding eigenvectors.
             ( 2  3 )
(ii) Find an invertible matrix P such that P⁻¹AP is diagonal.

   (i) Form the characteristic matrix tI - A of A:

         tI - A  =  ( t  0 ) - ( 1  4 )  =  ( t-1   -4  )                    (1)
                    ( 0  t )   ( 2  3 )     ( -2   t-3  )

      The characteristic polynomial Δ(t) of A is its determinant:

         Δ(t) = |tI - A| = | t-1   -4  |  =  t^2 - 4t - 5  =  (t - 5)(t + 1)
                           | -2   t-3  |

      The roots of Δ(t) are 5 and -1, and so these numbers are the eigenvalues of A.

      We obtain the eigenvectors belonging to the eigenvalue 5. First substitute t = 5 into
      the characteristic matrix (1) to obtain the matrix ( 4  -4; -2  2 ). The eigenvectors belonging to
      5 form the solution of the homogeneous system determined by the above matrix, i.e.,

         (  4  -4 )( x )  =  ( 0 )    or    {  4x - 4y = 0    or    x - y = 0
         ( -2   2 )( y )     ( 0 )          { -2x + 2y = 0

      (In other words, the eigenvectors belonging to 5 form the kernel of the operator tI - A for
      t = 5.) The above system has only one independent solution; for example, x = 1, y = 1. Thus
      v = (1, 1) is an eigenvector which generates the eigenspace of 5, i.e. every eigenvector belong-
      ing to 5 is a multiple of v.

      We obtain the eigenvectors belonging to the eigenvalue -1. Substitute t = -1 into (1)
      to obtain the homogeneous system

         ( -2  -4 )( x )  =  ( 0 )    or    { -2x - 4y = 0    or    x + 2y = 0
         ( -2  -4 )( y )     ( 0 )          { -2x - 4y = 0

      The system has only one independent solution; for example, x = 2, y = -1. Thus w = (2, -1)
      is an eigenvector which generates the eigenspace of -1.

   (ii) Let P be the matrix whose columns are the above eigenvectors: P = ( 1   2 ).
                                                                          ( 1  -1 )
      Then B = P⁻¹AP is the diagonal matrix whose diagonal entries are the respective eigenvalues:

         B  =  P⁻¹AP  =  ( 1/3   2/3 )( 1  4 )( 1   2 )  =  ( 5   0 )
                         ( 1/3  -1/3 )( 2  3 )( 1  -1 )     ( 0  -1 )

      (Remark. Here P is the transition matrix from the usual basis of R^2 to the basis of eigen-
      vectors {v, w}. Hence B is the matrix representation of the operator A in this new basis.)
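The diagonalization in part (ii) can be checked directly with exact rational arithmetic; this is a minimal sketch using the matrices of Problem 9.7.

```python
from fractions import Fraction

def mat_mul(X, Y):
    n, m, p = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) for j in range(p)] for i in range(n)]

def inverse_2x2(P):
    # Standard adjugate formula for a 2x2 inverse
    a, b = P[0]
    c, d = P[1]
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# A and the eigenvector matrix P from Problem 9.7
A = [[1, 4], [2, 3]]
P = [[1, 2], [1, -1]]
B = mat_mul(mat_mul(inverse_2x2(P), A), P)
print(B == [[5, 0], [0, -1]])   # B is the expected diagonal matrix
```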
9.8. For each matrix, find all eigenvalues and a basis of each eigenspace:

      (i) A = ( 1  -3  3 ),     (ii) B = ( -3   1  -1 )
              ( 3  -5  3 )               ( -7   5  -1 )
              ( 6  -6  4 )               ( -6   6  -2 )

   Which matrix can be diagonalized, and why?

   (i) Form the characteristic matrix tI - A and compute its determinant to obtain the character-
   istic polynomial Δ(t) of A:

      Δ(t) = |tI - A| = | t-1    3   -3  |  =  (t + 2)^2 (t - 4)
                        | -3   t+5   -3  |
                        | -6    6   t-4  |

   The roots of Δ(t) are -2 and 4; hence these numbers are the eigenvalues of A.
   We find a basis of the eigenspace of the eigenvalue -2. Substitute t = -2 into the char-
   acteristic matrix tI - A to obtain the homogeneous system

      { -3x + 3y - 3z = 0
      { -3x + 3y - 3z = 0        or        x - y + z = 0
      { -6x + 6y - 6z = 0

   The system has two independent solutions, e.g. x = 1, y = 1, z = 0 and x = 1, y = 0, z = -1.
   Thus u = (1, 1, 0) and v = (1, 0, -1) are independent eigenvectors which generate the eigen-
   space of -2. That is, u and v form a basis of the eigenspace of -2. This means that every
   eigenvector belonging to -2 is a linear combination of u and v.

   We find a basis of the eigenspace of the eigenvalue 4. Substitute t = 4 into the char-
   acteristic matrix tI - A to obtain the homogeneous system

      {  3x + 3y - 3z = 0
      { -3x + 9y - 3z = 0        or        { x + y - z = 0
      { -6x + 6y      = 0                  {    2y - z = 0

   The system has only one free variable; hence any particular nonzero solution, e.g. x = 1, y = 1,
   z = 2, generates its solution space. Thus w = (1, 1, 2) is an eigenvector which generates, and
   so forms a basis of, the eigenspace of 4.

   Since A has three linearly independent eigenvectors, A is diagonalizable. In fact, let P
   be the matrix whose columns are the three independent eigenvectors:

      P = ( 1   1  1 )        Then   P⁻¹AP = ( -2   0  0 )
          ( 1   0  1 )                       (  0  -2  0 )
          ( 0  -1  2 )                       (  0   0  4 )

   As expected, the diagonal elements of P⁻¹AP are the eigenvalues of A corresponding to the
   columns of P.
   (ii)
      Δ(t) = |tI - B| = | t+3   -1    1  |  =  (t + 2)^2 (t - 4)
                        |  7   t-5    1  |
                        |  6   -6   t+2  |

   The eigenvalues of B are therefore -2 and 4.

   We find a basis of the eigenspace of the eigenvalue -2. Substitute t = -2 into tI - B
   to obtain the homogeneous system

      {  x -  y + z = 0
      { 7x - 7y + z = 0        or        { x - y + z = 0
      { 6x - 6y     = 0                  {         z = 0

   The system has only one independent solution, e.g. x = 1, y = 1, z = 0. Thus u = (1, 1, 0)
   forms a basis of the eigenspace of -2.

   We find a basis of the eigenspace of the eigenvalue 4. Substitute t = 4 into tI - B to
   obtain the homogeneous system

      { 7x -  y +  z = 0
      { 7x -  y +  z = 0        or        { x = 0
      { 6x - 6y + 6z = 0                  { y - z = 0

   The system has only one independent solution, e.g. x = 0, y = 1, z = 1. Thus v = (0, 1, 1)
   forms a basis of the eigenspace of 4.

   Observe that B is not similar to a diagonal matrix since B has only two independent
   eigenvectors. Furthermore, since A can be diagonalized but B cannot, A and B are not similar
   matrices, even though they have the same characteristic polynomial.
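The contrast between the two matrices can be cross-checked by comparing geometric multiplicities, i.e. nullities of A + 2I and B + 2I. This sketch row-reduces over the rationals (a general-purpose helper, not a method from the text):

```python
from fractions import Fraction

def rank(M):
    """Row-reduce a copy of M over the rationals and count nonzero rows."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def shifted(M, c):
    return [[M[i][j] - (c if i == j else 0) for j in range(len(M))] for i in range(len(M))]

A = [[1, -3, 3], [3, -5, 3], [6, -6, 4]]
B = [[-3, 1, -1], [-7, 5, -1], [-6, 6, -2]]

# Geometric multiplicity of the eigenvalue -2 is 3 - rank(M - (-2)I)
print(3 - rank(shifted(A, -2)), 3 - rank(shifted(B, -2)))   # 2 for A, 1 for B
```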
9.9. Let A = ( 3  -1 ) and B = ( 1  -1 ). Find all eigenvalues and the corresponding
             ( 1   1 )         ( 2  -1 )
eigenvectors of A and B viewed as matrices over (i) the real field R, (ii) the complex
field C.

   (i) We have

      ΔA(t) = |tI - A| = | t-3   1  |  =  t^2 - 4t + 4  =  (t - 2)^2
                         | -1   t-1 |

   Hence only 2 is an eigenvalue. Put t = 2 into tI - A and obtain the homogeneous system

      { -x + y = 0        or        x - y = 0
      { -x + y = 0

   The system has only one independent solution, e.g. x = 1, y = 1. Thus v = (1, 1) is an eigen-
   vector which generates the eigenspace of 2, i.e. every eigenvector belonging to 2 is a multiple
   of v.

   We also have

      ΔB(t) = |tI - B| = | t-1   1  |  =  t^2 + 1
                         | -2   t+1 |

   Since t^2 + 1 has no root in R, B has no eigenvalue as a matrix over R.

   (ii) Since ΔA(t) = (t - 2)^2 has only the real root 2, the results are the same as in (i). That is,
   2 is an eigenvalue of A, and v = (1, 1) is an eigenvector which generates the eigenspace of 2,
   i.e. every eigenvector of 2 is a (complex) multiple of v.

   The characteristic polynomial of B is ΔB(t) = |tI - B| = t^2 + 1. Hence i and -i are the eigen-
   values of B.

   We find the eigenvectors associated with t = i. Substitute t = i into tI - B to obtain the
   homogeneous system

      ( i-1    1  )( x )  =  ( 0 )    or    { (i - 1)x + y = 0
      ( -2    i+1 )( y )     ( 0 )          { -2x + (i + 1)y = 0

   The system has only one independent solution, e.g. x = 1, y = 1 - i. Thus w = (1, 1 - i) is
   an eigenvector which generates the eigenspace of i.

   Now substitute t = -i into tI - B to obtain the homogeneous system

      ( -i-1    1  )( x )  =  ( 0 )    or    { (-i - 1)x + y = 0
      (  -2   -i+1 )( y )     ( 0 )          { -2x + (-i + 1)y = 0

   The system has only one independent solution, e.g. x = 1, y = 1 + i. Thus w' = (1, 1 + i) is
   an eigenvector which generates the eigenspace of -i.
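Complex eigenpairs like these are easy to check with Python's built-in complex arithmetic; here we verify that Bw = iw for the eigenpair of B found above.

```python
# B and the eigenvector w = (1, 1 - i) belonging to the eigenvalue i
B = [[1, -1], [2, -1]]
w = [1, 1 - 1j]

Bw = [sum(B[r][k] * w[k] for k in range(2)) for r in range(2)]
iw = [1j * x for x in w]
print(Bw == iw)   # the two vectors agree exactly
```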
9.10. Find all eigenvalues and a basis of each eigenspace of the operator T : R^3 -> R^3 defined
by T(x, y, z) = (2x + y, y - z, 2y + 4z).

   First find a matrix representation of T, say relative to the usual basis of R^3:

      A = [T] = ( 2  1   0 )
                ( 0  1  -1 )
                ( 0  2   4 )

   The characteristic polynomial Δ(t) of T is then

      Δ(t) = |tI - A| = | t-2  -1    0  |  =  (t - 2)^2 (t - 3)
                        |  0   t-1   1  |
                        |  0   -2   t-4 |

   Thus 2 and 3 are the eigenvalues of T.

   We find a basis of the eigenspace of the eigenvalue 2. Substitute t = 2 into tI - A to obtain
   the homogeneous system

      ( 0  -1   0 )( x )     ( 0 )
      ( 0   1   1 )( y )  =  ( 0 )        or        {     y = 0
      ( 0  -2  -2 )( z )     ( 0 )                  { y + z = 0
   The system has only one independent solution, e.g. x = 1, y = 0, z = 0. Thus u = (1, 0, 0) forms
   a basis of the eigenspace of 2.

   We find a basis of the eigenspace of the eigenvalue 3. Substitute t = 3 into tI - A to obtain
   the homogeneous system

      { x - y     = 0
      {    2y + z = 0

   The system has only one independent solution, e.g. x = 1, y = 1, z = -2. Thus v = (1, 1, -2)
   forms a basis of the eigenspace of 3.

   Observe that T is not diagonalizable, since T has only two linearly independent eigenvectors.
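Since T is given by an explicit formula, its eigenpairs can be verified by direct substitution:

```python
def T(x, y, z):
    # The operator of Problem 9.10
    return (2 * x + y, y - z, 2 * y + 4 * z)

u = (1, 0, 0)
v = (1, 1, -2)
print(T(*u) == tuple(2 * c for c in u))   # T(u) = 2u
print(T(*v) == tuple(3 * c for c in v))   # T(v) = 3v
```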
9.11. Show that 0 is an eigenvalue of T if and only if T is singular.

   We have that 0 is an eigenvalue of T if and only if there exists a nonzero vector v such that
   T(v) = 0v = 0, i.e. if and only if T is singular.
9.12. Let A and B be n-square matrices. Show that AB and BA have the same eigenvalues.

   By Problem 9.11 and the fact that the product of nonsingular matrices is nonsingular, the fol-
   lowing statements are equivalent: (i) 0 is an eigenvalue of AB, (ii) AB is singular, (iii) A or B is
   singular, (iv) BA is singular, (v) 0 is an eigenvalue of BA.

   Now suppose λ is a nonzero eigenvalue of AB. Then there exists a nonzero vector v such that
   ABv = λv. Set w = Bv. Since λ ≠ 0 and v ≠ 0,

      Aw = ABv = λv ≠ 0    and so    w ≠ 0

   But w is an eigenvector of BA belonging to the eigenvalue λ since

      BAw = BABv = Bλv = λBv = λw

   Hence λ is an eigenvalue of BA. Similarly, any nonzero eigenvalue of BA is also an eigenvalue
   of AB.

   Thus AB and BA have the same eigenvalues.
9.13. Suppose λ is an eigenvalue of an invertible operator T. Show that λ⁻¹ is an eigenvalue
of T⁻¹.

   Since T is invertible, it is also nonsingular; hence by Problem 9.11, λ ≠ 0.

   By definition of an eigenvalue, there exists a nonzero vector v for which T(v) = λv. Apply-
   ing T⁻¹ to both sides, we obtain v = T⁻¹(λv) = λ T⁻¹(v). Hence T⁻¹(v) = λ⁻¹ v; that is, λ⁻¹
   is an eigenvalue of T⁻¹.
9.14. Prove Theorem 9.3: Let v1, ..., vn be nonzero eigenvectors of an operator T : V -> V
belonging to distinct eigenvalues λ1, ..., λn. Then v1, ..., vn are linearly independent.

   The proof is by induction on n. If n = 1, then v1 is linearly independent since v1 ≠ 0.
   Assume n > 1. Suppose

      a1 v1 + a2 v2 + ... + an vn = 0                                          (1)

   where the ai are scalars. Applying T to the above relation, we obtain by linearity

      a1 T(v1) + a2 T(v2) + ... + an T(vn) = T(0) = 0

   But by hypothesis T(vi) = λi vi; hence

      a1 λ1 v1 + a2 λ2 v2 + ... + an λn vn = 0                                 (2)

   On the other hand, multiplying (1) by λn,

      a1 λn v1 + a2 λn v2 + ... + an λn vn = 0                                 (3)

   Now subtracting (3) from (2),

      a1 (λ1 - λn) v1 + a2 (λ2 - λn) v2 + ... + a_{n-1} (λ_{n-1} - λn) v_{n-1} = 0

   By induction, each of the above coefficients is 0. Since the λi are distinct, λi - λn ≠ 0 for i ≠ n.
   Hence a1 = ... = a_{n-1} = 0. Substituting this into (1) we get an vn = 0, and hence an = 0. Thus
   the vi are linearly independent.
CHARACTERISTIC POLYNOMIAL, CAYLEY-HAMILTON THEOREM
9.15. Consider a triangular matrix

      A = ( a11  a12  ...  a1n )
          (  0   a22  ...  a2n )
          ( .................. )
          (  0    0   ...  ann )

   Find its characteristic polynomial Δ(t) and its eigenvalues.

   Since A is triangular and tI is diagonal, tI - A is also triangular with diagonal elements t - aii:

      tI - A = ( t-a11  -a12   ...   -a1n )
               (   0    t-a22  ...   -a2n )
               ( ........................ )
               (   0      0    ...  t-ann )

   Then Δ(t) = |tI - A| is the product of the diagonal elements t - aii:

      Δ(t) = (t - a11)(t - a22) ... (t - ann)

   Hence the eigenvalues of A are a11, a22, ..., ann, i.e. its diagonal elements.
9.16. Let A = ( 1  2  3 ). Is A similar to a diagonal matrix? If so, find one such matrix.
              ( 0  2  3 )
              ( 0  0  3 )

   Since A is triangular, the eigenvalues of A are the diagonal elements 1, 2 and 3. Since they
   are distinct, A is similar to a diagonal matrix whose diagonal elements are 1, 2 and 3; for example,

      ( 1  0  0 )
      ( 0  2  0 )
      ( 0  0  3 )
9.17. For each matrix, find a polynomial having the matrix as a root:

      (i) A = ( 2   5 ),    (ii) B = (  2   3 ),    (iii) C = ( 1  4   3 )
              ( 1  -3 )              ( -7  -4 )               ( 0  3   1 )
                                                              ( 0  2  -1 )

   By the Cayley-Hamilton theorem every matrix is a root of its characteristic polynomial.
   Therefore we find the characteristic polynomial Δ(t) in each case.

   (i)  Δ(t) = |tI - A| = | t-2  -5  |  =  t^2 + t - 11
                          | -1   t+3 |

   (ii) Δ(t) = |tI - B| = | t-2  -3  |  =  t^2 + 2t + 13
                          |  7   t+4 |

   (iii) Δ(t) = |tI - C| = | t-1  -4   -3  |  =  (t - 1)(t^2 - 2t - 5)
                           |  0   t-3  -1  |
                           |  0   -2   t+1 |
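By Cayley-Hamilton each matrix is a root of the polynomial just found; taking C as in part (iii), the factored form gives (C - I)(C^2 - 2C - 5I) = 0, which this sketch evaluates entrywise.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def shifted(X, c):
    """Return X - c*I."""
    n = len(X)
    return [[X[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]

# C from Problem 9.17(iii)
C = [[1, 4, 3], [0, 3, 1], [0, 2, -1]]
C2 = mat_mul(C, C)
quad = [[C2[i][j] - 2 * C[i][j] - (5 if i == j else 0) for j in range(3)] for i in range(3)]
result = mat_mul(shifted(C, 1), quad)   # (C - I)(C^2 - 2C - 5I)
print(all(x == 0 for row in result for x in row))   # the product is the zero matrix
```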
9.18. Prove the Cayley-Hamilton Theorem 9.5: Every matrix is a zero of its characteristic
polynomial.

   Let A be an arbitrary n-square matrix and let Δ(t) be its characteristic polynomial; say,

      Δ(t) = |tI - A| = t^n + a_{n-1} t^{n-1} + ... + a1 t + a0

   Now let B(t) denote the classical adjoint of the matrix tI - A. The elements of B(t) are cofactors
   of the matrix tI - A and hence are polynomials in t of degree not exceeding n - 1. Thus

      B(t) = B_{n-1} t^{n-1} + ... + B1 t + B0

   where the Bi are n-square matrices over K which are independent of t. By the fundamental property
   of the classical adjoint (Theorem 8.8),

      (tI - A) B(t) = |tI - A| I

   or

      (tI - A)(B_{n-1} t^{n-1} + ... + B1 t + B0) = (t^n + a_{n-1} t^{n-1} + ... + a1 t + a0) I

   Removing parentheses and equating the coefficients of corresponding powers of t,

      B_{n-1} = I
      B_{n-2} - A B_{n-1} = a_{n-1} I
      ...
      B0 - A B1 = a1 I
      -A B0 = a0 I

   Multiplying the above matrix equations by A^n, A^{n-1}, ..., A, I respectively,

      A^n B_{n-1} = A^n
      A^{n-1} B_{n-2} - A^n B_{n-1} = a_{n-1} A^{n-1}
      A^{n-2} B_{n-3} - A^{n-1} B_{n-2} = a_{n-2} A^{n-2}
      ...
      A B0 - A^2 B1 = a1 A
      -A B0 = a0 I

   Adding the above matrix equations, the left side telescopes to 0:

      0 = A^n + a_{n-1} A^{n-1} + ... + a1 A + a0 I

   In other words, Δ(A) = 0. That is, A is a zero of its characteristic polynomial.
9.19. Show that a matrix A and its transpose A^t have the same characteristic polynomial.

   By the transpose operation, (tI - A)^t = tI^t - A^t = tI - A^t. Since a matrix and its transpose
   have the same determinant, |tI - A| = |(tI - A)^t| = |tI - A^t|. Hence A and A^t have the same char-
   acteristic polynomial.
9.20. Suppose M = ( A1  B ) where A1 and A2 are square matrices. Show that the char-
                  (  0  A2 )
acteristic polynomial of M is the product of the characteristic polynomials of A1 and
A2. Generalize.

      tI - M = ( tI - A1    -B    )
               (    0     tI - A2 )

   Hence by Problem 8.70, |tI - M| = |tI - A1| |tI - A2|, as required.

   By induction, the characteristic polynomial of the triangular block matrix

      M = ( A1  B  ...  C  )
          ( 0   A2 ...  D  )
          ( ............. )
          ( 0   0  ...  An )

   where the Ai are square matrices, is the product of the characteristic polynomials of the Ai.
MINIMUM POLYNOMIAL
9.21. Find the minimum polynomial m(t) of A = ( 2  1   0  0 )
                                              ( 0  2   0  0 )
                                              ( 0  0   1  1 )
                                              ( 0  0  -2  4 )

   The characteristic polynomial of A is

      Δ(t) = |tI - A| = | t-2  -1 | · | t-1  -1  |  =  (t - 2)^2 [(t - 1)(t - 4) + 2]  =  (t - 3)(t - 2)^3
                        |  0  t-2 |   |  2   t-4 |

   The minimum polynomial m(t) must divide Δ(t). Also, each irreducible factor of Δ(t), i.e. t - 2
   and t - 3, must be a factor of m(t). Thus m(t) is exactly one of the following:

      f(t) = (t-3)(t-2),    g(t) = (t-3)(t-2)^2,    h(t) = (t-3)(t-2)^3

   We have

      f(A) = (A - 3I)(A - 2I) = ( 0  -1  0  0 )  ≠  0
                                ( 0   0  0  0 )
                                ( 0   0  0  0 )
                                ( 0   0  0  0 )

      g(A) = (A - 3I)(A - 2I)^2 = 0

   Thus g(t) = (t - 3)(t - 2)^2 is the minimum polynomial of A.

   Remark. We know that h(A) = Δ(A) = 0 by the Cayley-Hamilton theorem. However, the degree
   of g(t) is less than the degree of h(t); hence g(t), and not h(t), is the minimum polynomial of A.
9.22. Find the minimal polynomial m(t) of each matrix (where a ≠ 0):

      (i) A = ( λ  a ),    (ii) B = ( λ  a  0 ),    (iii) C = ( λ  a  0  0 )
              ( 0  λ )              ( 0  λ  a )               ( 0  λ  a  0 )
                                    ( 0  0  λ )               ( 0  0  λ  a )
                                                              ( 0  0  0  λ )

   (i) The characteristic polynomial of A is Δ(t) = (t - λ)^2. We find A - λI ≠ 0; hence m(t) =
   Δ(t) = (t - λ)^2.

   (ii) The characteristic polynomial of B is Δ(t) = (t - λ)^3. (Note m(t) is exactly one of t - λ, (t - λ)^2
   or (t - λ)^3.) We find (B - λI)^2 ≠ 0; thus m(t) = Δ(t) = (t - λ)^3.

   (iii) The characteristic polynomial of C is Δ(t) = (t - λ)^4. We find (C - λI)^3 ≠ 0; hence m(t) =
   Δ(t) = (t - λ)^4.
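The nilpotency pattern behind case (ii) is easy to exhibit numerically; the values lam = 2 and a = 1 below are illustrative choices, not from the text.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

lam, a = 2, 1   # sample values; any lam and any a != 0 behave the same way
B = [[lam, a, 0], [0, lam, a], [0, 0, lam]]
N = [[B[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]  # B - lam*I

N2 = mat_mul(N, N)
N3 = mat_mul(N2, N)
print(any(x != 0 for row in N2 for x in row))   # (B - lam I)^2 is not zero
print(all(x == 0 for row in N3 for x in row))   # (B - lam I)^3 is zero
```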
9.23. Let M = ( A  0 ) where A and B are square matrices. Show that the minimum
              ( 0  B )
polynomial m(t) of M is the least common multiple of the minimum polynomials g(t)
and h(t) of A and B respectively. Generalize.

   Since m(t) is the minimum polynomial of M,

      m(M) = ( m(A)   0   )  =  0       and hence m(A) = 0 and m(B) = 0.
             (  0    m(B) )

   Since g(t) is the minimum polynomial of A, g(t) divides m(t). Similarly, h(t)
   divides m(t). Thus m(t) is a multiple of g(t) and h(t).

   Now let f(t) be another multiple of g(t) and h(t); then

      f(M) = ( f(A)   0   )  =  ( 0  0 )  =  0
             (  0    f(B) )     ( 0  0 )

   But m(t) is the minimum polynomial of M; hence m(t) divides f(t). Thus m(t) is the least common
   multiple of g(t) and h(t).

   We then have, by induction, that the minimum polynomial of

      M = ( A1  0  ...  0  )
          ( 0   A2 ...  0  )
          ( ............. )
          ( 0   0  ...  An )

   where the Ai are square matrices, is the least common multiple of the minimum polynomials of
   the Ai.
9.24. Find the minimum polynomial m(t) of

      M = ( 2  8  0  0  0  0 )
          ( 0  2  0  0  0  0 )
          ( 0  0  4  2  0  0 )
          ( 0  0  1  3  0  0 )
          ( 0  0  0  0  2  0 )
          ( 0  0  0  0  0  5 )

   Let A = ( 2  8 ), B = ( 4  2 ), C = (2) and D = (5). The minimum polynomials of
           ( 0  2 )      ( 1  3 )
   A, C and D are (t - 2)^2, t - 2 and t - 5 respectively. The characteristic polynomial of B is

      |tI - B| = | t-4  -2  |  =  t^2 - 7t + 10  =  (t - 2)(t - 5)
                 | -1   t-3 |

   and so it is also the minimum polynomial of B.

   Observe that M is the block diagonal matrix with diagonal blocks A, B, C and D. Thus, by
   Problem 9.23, m(t) is the least common multiple of the minimum polynomials of A, B, C and D.
   Accordingly, m(t) = (t - 2)^2 (t - 5).
9.25. Show that the minimum polynomial of a matrix (operator) A exists and is unique.

   By the Cayley-Hamilton theorem, A is a zero of some nonzero polynomial (see also Problem 9.31).
   Let n be the lowest degree for which a polynomial f(t) exists such that f(A) = 0. Dividing f(t) by
   its leading coefficient, we obtain a monic polynomial m(t) of degree n which has A as a zero. Sup-
   pose m'(t) is another monic polynomial of degree n for which m'(A) = 0. Then the difference
   m(t) - m'(t) is a nonzero polynomial of degree less than n which has A as a zero. This contradicts
   the original assumption on n; hence m(t) is a unique minimum polynomial.
9.26. Prove Theorem 9.10: The minimum polynomial m(t) of a matrix (operator) A
divides every polynomial which has A as a zero. In particular, m(t) divides the char-
acteristic polynomial of A.

   Suppose f(t) is a polynomial for which f(A) = 0. By the division algorithm there exist poly-
   nomials q(t) and r(t) for which f(t) = m(t) q(t) + r(t) and r(t) = 0 or deg r(t) < deg m(t). Sub-
   stituting t = A in this equation, and using that f(A) = 0 and m(A) = 0, we obtain r(A) = 0. If
   r(t) ≠ 0, then r(t) is a polynomial of degree less than m(t) which has A as a zero; this contradicts
   the definition of the minimum polynomial. Thus r(t) = 0 and so f(t) = m(t) q(t), i.e. m(t) divides f(t).
9.27. Let m(t) be the minimum polynomial of an n-square matrix A. Show that the char-
acteristic polynomial of A divides (m(t))^n.

   Suppose m(t) = t^r + c1 t^{r-1} + ... + c_{r-1} t + c_r. Consider the following matrices:

      B0 = I
      B1 = A + c1 I
      B2 = A^2 + c1 A + c2 I
      ...
      B_{r-1} = A^{r-1} + c1 A^{r-2} + ... + c_{r-1} I

   Then

      B0 = I
      B1 - A B0 = c1 I
      B2 - A B1 = c2 I
      ...
      B_{r-1} - A B_{r-2} = c_{r-1} I

   Also,

      -A B_{r-1} = c_r I - (A^r + c1 A^{r-1} + ... + c_{r-1} A + c_r I)
                 = c_r I - m(A)
                 = c_r I

   Set B(t) = t^{r-1} B0 + t^{r-2} B1 + ... + t B_{r-2} + B_{r-1}. Then

      (tI - A) B(t) = (t^r B0 + t^{r-1} B1 + ... + t B_{r-1}) - (t^{r-1} A B0 + t^{r-2} A B1 + ... + A B_{r-1})
                    = t^r B0 + t^{r-1}(B1 - A B0) + t^{r-2}(B2 - A B1) + ... + t(B_{r-1} - A B_{r-2}) - A B_{r-1}
                    = t^r I + c1 t^{r-1} I + c2 t^{r-2} I + ... + c_{r-1} t I + c_r I
                    = m(t) I

   The determinant of both sides gives |tI - A| |B(t)| = |m(t) I| = (m(t))^n. Since |B(t)| is a polynomial,
   |tI - A| divides (m(t))^n; that is, the characteristic polynomial of A divides (m(t))^n.
9.28. Prove Theorem 9.11: The characteristic polynomial Δ(t) and the minimum poly-
nomial m(t) of a matrix A have the same irreducible factors.

   Suppose f(t) is an irreducible polynomial. If f(t) divides m(t) then, since m(t) divides Δ(t), f(t)
   divides Δ(t). On the other hand, if f(t) divides Δ(t) then, by the preceding problem, f(t) divides
   (m(t))^n. But f(t) is irreducible; hence f(t) also divides m(t). Thus m(t) and Δ(t) have the same
   irreducible factors.
9.29. Let T be a linear operator on a vector space V of finite dimension. Show that T is
invertible if and only if the constant term of the minimal (characteristic) polynomial
of T is not zero.

   Suppose the minimal (characteristic) polynomial of T is f(t) = t^r + a_{r-1} t^{r-1} + ... + a1 t + a0.
   Each of the following statements is equivalent to the succeeding one by preceding results: (i) T is
   invertible; (ii) T is nonsingular; (iii) 0 is not an eigenvalue of T; (iv) 0 is not a root of f(t); (v) the
   constant term a0 is not zero. Thus the theorem is proved.
9.30. Suppose dim V = n. Let T : V -> V be an invertible operator. Show that T⁻¹ is
equal to a polynomial in T of degree not exceeding n.

   Let m(t) be the minimal polynomial of T. Then m(t) = t^r + a_{r-1} t^{r-1} + ... + a1 t + a0, where
   r ≤ n. Since T is invertible, a0 ≠ 0 by Problem 9.29. We have

      m(T) = T^r + a_{r-1} T^{r-1} + ... + a1 T + a0 I = 0

   Hence

      -(1/a0)(T^{r-1} + a_{r-1} T^{r-2} + ... + a1 I) T = I

   and so

      T⁻¹ = -(1/a0)(T^{r-1} + a_{r-1} T^{r-2} + ... + a1 I)
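This formula gives a concrete way to invert a matrix. As an illustration (reusing the matrix of Problem 9.7, whose minimal polynomial is t^2 - 4t - 5), A^2 - 4A - 5I = 0 yields A⁻¹ = (1/5)(A - 4I):

```python
from fractions import Fraction

A = [[1, 4], [2, 3]]
# A^-1 = (1/5)(A - 4I), from the minimal polynomial t^2 - 4t - 5
inv = [[Fraction(A[i][j] - (4 if i == j else 0), 5) for j in range(2)] for i in range(2)]

# Check: A * inv should be the identity
prod = [[sum(A[i][k] * inv[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(prod == [[1, 0], [0, 1]])   # the product is the identity matrix
```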
MISCELLANEOUS PROBLEMS
9.31. Let T be a linear operator on a vector space V of dimension n. Without using the
Cayley-Hamilton theorem, show that T is a zero of a nonzero polynomial.

   Let N = n^2. Consider the following N + 1 operators on V: I, T, T^2, ..., T^N. Recall that the
   vector space A(V) of operators on V has dimension N = n^2. Thus the above N + 1 operators are
   linearly dependent. Hence there exist scalars a0, a1, ..., aN, not all zero, for which
   aN T^N + ... + a1 T + a0 I = 0. Accordingly, T is a zero of the nonzero polynomial
   f(t) = aN t^N + ... + a1 t + a0.
9.32. Prove Theorem 9.13: Let λ be an eigenvalue of an operator T : V -> V. The geometric
multiplicity of λ does not exceed its algebraic multiplicity.

   Suppose the geometric multiplicity of λ is r. Then the eigenspace of λ contains r linearly
   independent eigenvectors v1, ..., vr. Extend the set {vi} to a basis of V: {v1, ..., vr, w1, ..., ws}.
   We have

      T(v1) = λ v1
      T(v2) = λ v2
      ...........
      T(vr) = λ vr
      T(w1) = a11 v1 + ... + a1r vr + b11 w1 + ... + b1s ws
      T(w2) = a21 v1 + ... + a2r vr + b21 w1 + ... + b2s ws
      .................................................
      T(ws) = as1 v1 + ... + asr vr + bs1 w1 + ... + bss ws

   The matrix of T in the above basis is

      M = ( λ I_r   A )
          (   0     B )

   where A = (a_{ij})^t and B = (b_{ij})^t.

   By Problem 9.20 the characteristic polynomial of λ I_r, which is (t - λ)^r, must divide the char-
   acteristic polynomial of M and hence of T. Thus the algebraic multiplicity of λ for the operator T is
   at least r, as required.
9.33. Show that A = ( 1  1 ) is not diagonalizable.
                    ( 0  1 )

   The characteristic polynomial of A is Δ(t) = (t - 1)^2; hence 1 is the only eigenvalue of A. We
   find a basis of the eigenspace of the eigenvalue 1. Substitute t = 1 into the matrix tI - A to obtain
   the homogeneous system

      ( 0  -1 )( x )  =  ( 0 )        or        y = 0
      ( 0   0 )( y )     ( 0 )

   The system has only one independent solution, e.g. x = 1, y = 0. Hence u = (1, 0) forms a basis
   of the eigenspace of 1.

   Since A has at most one independent eigenvector, A cannot be diagonalized.
9.34. Let F be an extension of a field K. Let A be an n-square matrix over K. Note that
A may also be viewed as a matrix Â over F. Clearly |tI - A| = |tI - Â|, that is, A
and Â have the same characteristic polynomial. Show that A and Â also have the
same minimum polynomial.

   Let m(t) and m'(t) be the minimum polynomials of A and Â respectively. Now m'(t) divides
   every polynomial over F which has Â as a zero. Since m(t) has Â as a zero and since m(t) may be
   viewed as a polynomial over F, m'(t) divides m(t). We show now that m(t) divides m'(t).

   Since m'(t) is a polynomial over F which is an extension of K, we may write

      m'(t) = f1(t) b1 + f2(t) b2 + ... + fn(t) bn

   where fi(t) are polynomials over K, and b1, ..., bn belong to F and are linearly independent over K.
   We have

      m'(A) = f1(A) b1 + f2(A) b2 + ... + fn(A) bn = 0                        (1)

   Let a_{ij}^{(k)} denote the ij-entry of fk(A). The above matrix equation implies that, for each pair (i, j),

      a_{ij}^{(1)} b1 + a_{ij}^{(2)} b2 + ... + a_{ij}^{(n)} bn = 0

   Since the bi are linearly independent over K and since the a_{ij}^{(k)} ∈ K, every a_{ij}^{(k)} = 0. Then

      f1(A) = 0,  f2(A) = 0,  ...,  fn(A) = 0

   Since the fi(t) are polynomials over K which have A as a zero and since m(t) is the minimum poly-
   nomial of A as a matrix over K, m(t) divides each of the fi(t). Accordingly, by (1), m(t) must also
   divide m'(t). But monic polynomials which divide each other are necessarily equal. That is,
   m(t) = m'(t), as required.
9.35. Let {v1, ..., vn} be a basis of V. Let T : V → V be an operator for which T(v1) = 0,
T(v2) = a21v1, T(v3) = a31v1 + a32v2, ..., T(vn) = a_n1 v1 + ··· + a_{n,n−1} v_{n−1}. Show that
Tⁿ = 0.
It suffices to show that
T^j(vj) = 0     (*)
for j = 1, ..., n. For then it follows that
Tⁿ(vj) = T^{n−j}(T^j(vj)) = T^{n−j}(0) = 0,   for j = 1, ..., n
and, since {v1, ..., vn} is a basis, Tⁿ = 0.
We prove (*) by induction on j. The case j = 1 is true by hypothesis. The inductive step
follows (for j = 2, ..., n) from
T^j(vj) = T^{j−1}(T(vj)) = T^{j−1}(a_j1 v1 + ··· + a_{j,j−1} v_{j−1})
= a_j1 T^{j−1}(v1) + ··· + a_{j,j−1} T^{j−1}(v_{j−1})
= a_j1 · 0 + ··· + a_{j,j−1} · 0 = 0
(Here T^{j−1}(vi) = T^{j−1−i}(T^i(vi)) = 0 for i < j, by the inductive hypothesis.)
Remark. Observe that the matrix representation of T in the above basis is triangular with
diagonal elements 0:

    / 0  a21  a31  ...  a_n1       \
    | 0   0   a32  ...  a_n2       |
    | ........................... |
    | 0   0    0   ...  a_{n,n−1} |
    \ 0   0    0   ...   0        /
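The conclusion of Problem 9.35, that a strictly triangular matrix is nilpotent with Tⁿ = 0, is easy to spot-check numerically. A minimal sketch (the 4×4 matrix below is an invented example, not from the text):

```python
import numpy as np

# A strictly upper triangular matrix represents an operator with T(v1) = 0
# and T(vj) a combination of v1, ..., v_{j-1}; Problem 9.35 says T^n = 0.
n = 4
rng = np.random.default_rng(0)
T = np.triu(rng.integers(1, 9, size=(n, n)), k=1).astype(float)

powers = [np.linalg.matrix_power(T, k) for k in range(1, n + 1)]
# T^{n-1} is still nonzero here, but T^n vanishes identically.
assert not np.allclose(powers[n - 2], 0)
assert np.allclose(powers[n - 1], 0)
```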
Supplementary Problems
POLYNOMIALS OF MATRICES AND LINEAR OPERATORS
9.36. Let f(t) = 2t² − 5t + 6 and g(t) = t³ − 2t² + t + 3. Find f(A), g(A), f(B) and g(B) where
9.37. Let T : R² → R² be defined by T(x, y) = (x + y, 2x). Let f(t) = t² − 2t + 3. Find f(T)(x, y).
9.38. Let V be the vector space of polynomials v(x) = ax² + bx + c. Let D : V → V be the differential
operator. Let f(t) = t² + 2t − 5. Find f(D)(v(x)).
9.39. Let A
.0
Find A², A³, Aⁿ.
9.40. Let B = / 8  12   0 \
              | 0   8  12 |
              \ 0   0   8 /
Find a real matrix A such that B = A³.
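A solution sketch for 9.40, following the Hint printed in the answers (take A = 2I plus a strictly triangular part and match the entries of A³): since A³ = 8I + 12M + 6M² for M = A − 2I, matching entries forces the superdiagonal values 1, 1 and corner value −1/2. This is one consistent solution, verified numerically below:

```python
import numpy as np

# Candidate from matching entries of A^3 = B with A = 2I + M, M strictly triangular:
A = np.array([[2.0, 1.0, -0.5],
              [0.0, 2.0,  1.0],
              [0.0, 0.0,  2.0]])
B = np.array([[8.0, 12.0,  0.0],
              [0.0,  8.0, 12.0],
              [0.0,  0.0,  8.0]])
assert np.allclose(np.linalg.matrix_power(A, 3), B)
```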
9.41. Consider a diagonal matrix M and a triangular matrix N:

    M = / a1            \           N = / a1  b  ...  c \
        |    a2         |   and         |     a2 ...  d |
        |       ...     |               |        ...    |
        \            an /               \            an /

Show that, for any polynomial f(t), f(M) and f(N) are of the form

    f(M) = / f(a1)              \           f(N) = / f(a1)  x  ...  y \
           |      f(a2)         |   and            |      f(a2) ...  z |
           |            ...     |                  |             ...   |
           \             f(an)  /                  \             f(an) /

i.e. f(M) is diagonal with diagonal entries f(ai), and f(N) is triangular with diagonal entries f(ai).
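A quick numerical illustration of 9.41 (the matrix and polynomial below are invented examples): evaluating f(t) = t² − 3t + 2 at a triangular matrix leaves an upper triangular result whose diagonal is f applied entrywise to the original diagonal.

```python
import numpy as np

def f(X):
    # f(t) = t^2 - 3t + 2 evaluated at a matrix argument
    I = np.eye(X.shape[0])
    return X @ X - 3 * X + 2 * I

N = np.array([[1.0, 4.0, 5.0],
              [0.0, 2.0, 6.0],
              [0.0, 0.0, 3.0]])   # triangular, diagonal (1, 2, 3)

FN = f(N)
# f(N) stays upper triangular ...
assert np.allclose(np.tril(FN, k=-1), 0)
# ... and its diagonal is (f(1), f(2), f(3)) = (0, 0, 2).
assert np.allclose(np.diag(FN), [0.0, 0.0, 2.0])
```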
9.42. Consider a block diagonal matrix M and a block triangular matrix N:

    M = / A1            \           N = / A1  B  ...  C \
        |    A2         |   and         |     A2 ...  D |
        |       ...     |               |        ...    |
        \            An /               \            An /

where the Ai are square matrices. Show that, for any polynomial f(t), f(M) and f(N) are of the form

    f(M) = / f(A1)              \           f(N) = / f(A1)  X  ...  Y \
           |      f(A2)         |   and            |      f(A2) ...  Z |
           |            ...     |                  |             ...   |
           \             f(An)  /                  \             f(An) /
9.43. Show that for any square matrix (or operator) A, (P⁻¹AP)ⁿ = P⁻¹AⁿP where P is invertible. More
generally, show that f(P⁻¹AP) = P⁻¹f(A)P for any polynomial f(t).
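The identity of 9.43 is easy to check by machine. A sketch with an arbitrary (invented) A, P and f(t) = t³ + 2t − 1:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
P = np.array([[2.0, 1.0], [1.0, 1.0]])   # det = 1, so P is invertible
Pinv = np.linalg.inv(P)

def f(X):
    # f(t) = t^3 + 2t - 1 evaluated at a matrix
    I = np.eye(X.shape[0])
    return np.linalg.matrix_power(X, 3) + 2 * X - I

# Conjugation commutes with polynomial evaluation.
lhs = f(Pinv @ A @ P)
rhs = Pinv @ f(A) @ P
assert np.allclose(lhs, rhs)
```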
9.44. Let f(t) be any polynomial. Show that: (i) f(Aᵗ) = (f(A))ᵗ; (ii) if A is symmetric, i.e. Aᵗ = A,
then f(A) is symmetric.
EIGENVALUES AND EIGENVECTORS
9.45. For each matrix, find all eigenvalues and linearly independent eigenvectors:

    (i) A = / 2  2 \ ,   (ii) B = / 4  2 \ ,   (iii) C = /  5  1 \
            \ 1  3 /              \ 3  3 /               \ −1  3 /

Find invertible matrices P1, P2 and P3 such that P1⁻¹AP1, P2⁻¹BP2 and P3⁻¹CP3 are diagonal.
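Such computations can be checked by machine. A sketch for part (i), assuming A = (2 2; 1 3) as reconstructed from the printed answers (eigenvalues 1 and 4): numpy finds the eigenpairs, and the matrix of eigenvectors diagonalizes A.

```python
import numpy as np

A = np.array([[2.0, 2.0], [1.0, 3.0]])
evals, evecs = np.linalg.eig(A)
# Eigenvalues 1 and 4, matching the answer to 9.45(i).
assert np.allclose(np.sort(evals), [1.0, 4.0])

# Columns of P are eigenvectors; P^{-1} A P is then diagonal.
P = evecs
D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag(evals))
```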
9.46. For each matrix, find all eigenvalues and a basis for each eigenspace:

    (i) A = / 3  1  1 \ ,   (ii) B = /  1  2  2 \ ,   (iii) C = / 1  1  0 \
            | 2  4  2 |              |  1  2 −1 |               | 0  1  0 |
            \ 1  1  3 /              \ −1  1  4 /               \ 0  0  1 /

When possible, find invertible matrices P1, P2 and P3 such that P1⁻¹AP1, P2⁻¹BP2 and P3⁻¹CP3 are
diagonal.
9.47. Consider A = /  2  1 \  and  B = /  3  −1 \  as matrices over the real field R. Find all eigen-
                   \ −1  4 /           \ 13  −3 /
values and linearly independent eigenvectors.
9.48. Consider A and B in the preceding problem as matrices over the complex field C. Find all eigen
values and linearly independent eigenvectors.
9.49. For each of the following operators T : R² → R², find all eigenvalues and a basis for each eigen-
space: (i) T(x, y) = (3x + 3y, x + 5y); (ii) T(x, y) = (y, x); (iii) T(x, y) = (y, −x).
9.50. For each of the following operators T : R³ → R³, find all eigenvalues and a basis for each
eigenspace: (i) T(x, y, z) = (x + y + z, 2y + z, 2y + 3z); (ii) T(x, y, z) = (x + y, y + z, −2y − z);
(iii) T(x, y, z) = (x − y, 2x + 3y + 2z, x + y + 2z).
9.51. For each of the following matrices over the complex field C, find all eigenvalues and linearly
independent eigenvectors:

    (i) / 1  i \ ,   (ii) / 1  1 \ ,   (iii) /  1  3i \ ,   (iv) / 1  −2 \
        \ 0  i /         \ 0  1 /           \ −i  −1 /          \ 1  −1 /
9.52. Suppose v is an eigenvector of operators S and T. Show that v is also an eigenvector of the operator
aS + bT where a and b are any scalars.

9.53. Suppose v is an eigenvector of an operator T belonging to the eigenvalue λ. Show that for n > 0,
v is also an eigenvector of Tⁿ belonging to λⁿ.
9.54. Suppose λ is an eigenvalue of an operator T. Show that f(λ) is an eigenvalue of f(T).
9.55. Show that similar matrices have the same eigenvalues.
9.56. Show that matrices A and Aᵗ have the same eigenvalues. Give an example where A and Aᵗ have
different eigenvectors.

9.57. Let S and T be linear operators such that ST = TS. Let λ be an eigenvalue of T and let W be
its eigenspace. Show that W is invariant under S, i.e. S(W) ⊆ W.
9.58. Let V be a vector space of finite dimension over the complex field C. Let W ≠ {0} be a subspace
of V invariant under a linear operator T : V → V. Show that W contains a nonzero eigenvector of T.

9.59. Let A be an n-square matrix over K. Let v1, ..., vn ∈ Kⁿ be linearly independent eigenvectors of
A belonging to the eigenvalues λ1, ..., λn respectively. Let P be the matrix whose columns are the
vectors v1, ..., vn. Show that P⁻¹AP is the diagonal matrix whose diagonal elements are the
eigenvalues λ1, ..., λn.
CHARACTERISTIC AND MINIMUM POLYNOMIALS
9.60. For each matrix, find a polynomial for which the matrix is a root:
(i) A, a 2×2 matrix; (ii) B, a 2×2 matrix; (iii) C, a 3×3 matrix.
9.61. Consider the n-square matrix

    A = / λ  1  0  ...  0 \
        | 0  λ  1  ...  0 |
        | ................ |
        | 0  0  0  ...  1 |
        \ 0  0  0  ...  λ /

Show that f(t) = (t − λ)ⁿ is both the characteristic and minimum polynomial of A.
9.62. Find the characteristic and minimum polynomials of each matrix:

    A = / 2  5  0  0  0 \       B = / 3  1  0  0  0 \       C = / λ  0  0  0  0 \
        | 0  2  0  0  0 |           | 0  3  0  0  0 |           | 0  λ  0  0  0 |
        | 0  0  4  2  0 |           | 0  0  3  1  0 |           | 0  0  λ  0  0 |
        | 0  0  3  5  0 |           | 0  0  0  3  1 |           | 0  0  0  λ  0 |
        \ 0  0  0  0  7 /           \ 0  0  0  0  3 /           \ 0  0  0  0  λ /
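The printed answer to 9.62(i) gives Δ(t) = (t − 2)³(t − 7)² but the smaller m(t) = (t − 2)²(t − 7). Using my reconstruction of the block matrix A, this can be verified numerically: m(A) = 0, while the proper divisor (t − 2)(t − 7) does not annihilate A.

```python
import numpy as np

A = np.array([[2, 5, 0, 0, 0],
              [0, 2, 0, 0, 0],
              [0, 0, 4, 2, 0],
              [0, 0, 3, 5, 0],
              [0, 0, 0, 0, 7]], dtype=float)
I = np.eye(5)

def p(*factors):
    # multiply out the given matrix factors left to right
    out = I.copy()
    for F in factors:
        out = out @ F
    return out

# m(t) = (t-2)^2 (t-7) annihilates A ...
assert np.allclose(p(A - 2*I, A - 2*I, A - 7*I), 0)
# ... but (t-2)(t-7) does not, so the square on (t-2) is needed.
assert not np.allclose(p(A - 2*I, A - 7*I), 0)
```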
9.63. Let A = / 1  1  0 \   and   B = / 2  0  0 \
              | 0  2  0 |             | 0  2  2 |
              \ 0  0  1 /             \ 0  0  1 /
Show that A and B have different characteristic polynomials (and so are not similar), but have the
same minimum polynomial. Thus non-similar matrices may have the same minimum polynomial.
9.64. The mapping T : V → V defined by T(v) = kv is called the scalar mapping belonging to k ∈ K.
Show that T is the scalar mapping belonging to k ∈ K if and only if the minimal polynomial of
T is m(t) = t − k.
9.65. Let A be an n-square matrix for which A^k = 0 for some k > n. Show that Aⁿ = 0.

9.66. Show that a matrix A and its transpose Aᵗ have the same minimum polynomial.

9.67. Suppose f(t) is an irreducible monic polynomial for which f(T) = 0, where T is a linear operator
T : V → V. Show that f(t) is the minimal polynomial of T.
9.68. Consider a block matrix M = ( A  B ; C  D ). Show that tI − M = ( tI − A   −B ; −C   tI − D )
is the characteristic matrix of M.
9.69. Let T be a linear operator on a vector space V of finite dimension. Let W be a subspace of V
invariant under T, i.e. T(W) ⊆ W. Let T_W : W → W be the restriction of T to W. (i) Show that
the characteristic polynomial of T_W divides the characteristic polynomial of T. (ii) Show that the
minimum polynomial of T_W divides the minimum polynomial of T.
9.70. Let A = / a11  a12  a13 \
              | a21  a22  a23 |
              \ a31  a32  a33 /
Show that the characteristic polynomial of A is

    Δ(t) = t³ − (a11 + a22 + a33)t²
             + ( |a11 a12; a21 a22| + |a11 a13; a31 a33| + |a22 a23; a32 a33| ) t − |A|

where |a b; c d| denotes the 2×2 determinant ad − bc and |A| = det A.
9.71. Let A be an n-square matrix. The determinant of the matrix of order n − m obtained by deleting
the rows and columns passing through m diagonal elements of A is called a principal minor of degree
n − m. Show that the coefficient of t^m in the characteristic polynomial Δ(t) = |tI − A| is the sum
of all principal minors of A of degree n − m multiplied by (−1)^{n−m}. (Observe that the preceding
problem is a special case of this result.)
9.72. Consider an arbitrary monic polynomial f(t) = tⁿ + a_{n−1}t^{n−1} + ··· + a1t + a0. The following
n-square matrix A is called the companion matrix of f(t):

    A = / 0  0  ...  0  −a0      \
        | 1  0  ...  0  −a1      |
        | 0  1  ...  0  −a2      |
        | ...................... |
        \ 0  0  ...  1  −a_{n−1} /
Show that f{t) is the minimum polynomial of A.
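A numerical sketch of this construction, using f(t) = t³ − 5t² + 6t + 8 (the polynomial from Problem 9.73(i)): the companion matrix built by the recipe above satisfies f(A) = 0, and its characteristic polynomial is f itself.

```python
import numpy as np

def companion(coeffs):
    """Companion matrix of t^n + c_{n-1} t^{n-1} + ... + c_1 t + c_0,
    with 1's on the subdiagonal and -c_k down the last column."""
    c = np.asarray(coeffs, dtype=float)   # (c_0, c_1, ..., c_{n-1})
    n = len(c)
    A = np.zeros((n, n))
    A[1:, :-1] = np.eye(n - 1)
    A[:, -1] = -c
    return A

# f(t) = t^3 - 5t^2 + 6t + 8, i.e. (c_0, c_1, c_2) = (8, 6, -5)
A = companion([8.0, 6.0, -5.0])
fA = np.linalg.matrix_power(A, 3) - 5*np.linalg.matrix_power(A, 2) + 6*A + 8*np.eye(3)
assert np.allclose(fA, 0)
# np.poly returns characteristic polynomial coefficients, leading term first.
assert np.allclose(np.poly(A), [1.0, -5.0, 6.0, 8.0])
```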
9.73. Find a matrix A whose minimum polynomial is (i) t³ − 5t² + 6t + 8, (ii) t⁴ − 5t³ − 2t² + 7t + 4.
DIAGONALIZATION
9.74. Let A = ( a  b ; c  d ) be a matrix over the real field R. Find necessary and sufficient conditions on
a, b, c and d so that A is diagonalizable, i.e. has two linearly independent eigenvectors.
9.75. Repeat the preceding problem for the case that A is a matrix over the complex field C.
9.76. Show that a matrix (operator) is diagonalizable if and only if its minimal polynomial is a product
of distinct linear factors.
9.77. Let A and B be n-square matrices over K such that (i) AB = BA and (ii) A and B are both
diagonalizable. Show that A and B can be simultaneously diagonalized, i.e. there exists a basis of
Kⁿ in which both A and B are represented by diagonal matrices. (See Problem 9.57.)

9.78. Let E : V → V be a projection operator, i.e. E² = E. Show that E is diagonalizable and, in fact,
can be represented by the diagonal matrix A = ( I_r  0 ; 0  0 ) where r is the rank of E.
Answers to Supplementary Problems
9.36. f(A) = ( 26  −3 ; 5  −27 ),  g(A) = ( −40  39 ; −65  −27 ),  f(B) = ( 3  6 ; 0  9 ),  g(B) = ( 3  12 ; 0  15 ).
9.37. f(T)(x, y) = (4x − y, −2x + 5y).
9.38. f(D)(v(x)) = −5ax² + (4a − 5b)x + (2a + 2b − 5c).
"'• ': I)' ^'(i t)' ^ii:
9.40. Hint. Let A = / 2  a  b \ . Set B = A³ and then obtain conditions on a, b and c.
                    | 0  2  c |
                    \ 0  0  2 /
9.44. (ii) Using (i), we have (f(A))ᵗ = f(Aᵗ) = f(A).
9.45. (i) λ1 = 1, u = (2, −1); λ2 = 4, v = (1, 1).
(ii) λ1 = 1, u = (2, −3); λ2 = 6, v = (1, 1).
(iii) λ = 4, u = (1, −1).
Let P1 = /  2  1 \   and   P2 = /  2  1 \ .  P3 does not exist since C has only one independent
         \ −1  1 /              \ −3  1 /
eigenvector, and so cannot be diagonalized.
9.46. (i) λ1 = 2, u = (1, −1, 0), v = (1, 0, −1); λ2 = 6, w = (1, 2, 1).
(ii) λ1 = 3, u = (1, 1, 0), v = (1, 0, 1); λ2 = 1, w = (2, −1, 1).
(iii) λ = 1, u = (1, 0, 0), v = (0, 0, 1).
Let P1 = /  1  1  1 \   and   P2 = / 1  1  2 \ .  P3 does not exist since C has at most two
         | −1  0  2 |              | 1  0 −1 |
         \  0 −1  1 /              \ 0  1  1 /
linearly independent eigenvectors, and so cannot be diagonalized.
9.47. (i) λ = 3, u = (1, 1); (ii) B has no eigenvalues (in R).

9.48. (i) λ = 3, u = (1, 1). (ii) λ1 = 2i, u = (1, 3 − 2i); λ2 = −2i, v = (1, 3 + 2i).

9.49. (i) λ1 = 2, u = (3, −1); λ2 = 6, v = (1, 1). (ii) λ1 = 1, u = (1, 1); λ2 = −1, v = (1, −1). (iii) There
are no eigenvalues (in R).

9.50. (i) λ1 = 1, u = (1, 0, 0); λ2 = 4, v = (1, 1, 2).
(ii) λ = 1, u = (1, 0, 0). There are no other eigenvalues (in R).
(iii) λ1 = 1, u = (1, 0, −1); λ2 = 2, v = (−2, 2, 1); λ3 = 3, w = (1, −2, −1).

9.51. (i) λ1 = 1, u = (1, 0); λ2 = i, v = (1, 1 + i). (ii) λ = 1, u = (1, 0). (iii) λ1 = 2, u = (3, −i); λ2 = −2,
v = (1, i). (iv) λ1 = i, u = (2, 1 − i); λ2 = −i, v = (2, 1 + i).
9.56. Let A = / 1  1 \ . Then λ = 1 is the only eigenvalue and v = (1, 0) generates the eigenspace
              \ 0  1 /
of λ = 1. On the other hand, for Aᵗ = / 1  0 \ , λ = 1 is still the only eigenvalue, but w = (0, 1)
                                      \ 1  1 /
generates the eigenspace of λ = 1.
9.57. Let v ∈ W, so that T(v) = λv. Then T(Sv) = S(Tv) = S(λv) = λ(Sv); that is, Sv is an eigenvector
of T belonging to the eigenvalue λ. In other words, Sv ∈ W and thus S(W) ⊆ W.

9.58. Let T̂ : W → W be the restriction of T to W. The characteristic polynomial of T̂ is a polynomial
over the complex field C which, by the fundamental theorem of algebra, has a root λ. Then λ is an
eigenvalue of T̂, and so T̂ has a nonzero eigenvector in W which is also an eigenvector of T.
9.59. Suppose T(v) = λv. Then (kT)(v) = kT(v) = k(λv) = (kλ)v.
9.60. (i) f(t) = t² − 5t + 43, (ii) g(t) = t² − 8t + 23, (iii) h(t) = t³ − 6t² + 5t − 12.
9.62. (i) Δ(t) = (t − 2)³(t − 7)²; m(t) = (t − 2)²(t − 7). (ii) Δ(t) = (t − 3)⁵; m(t) = (t − 3)³.
(iii) Δ(t) = (t − λ)⁵; m(t) = t − λ.
9.73. Use the result of Problem 9.72. (i) A = / 0  0  −8 \ ,   (ii) A = / 0  0  0  −4 \
                                              | 1  0  −6 |             | 1  0  0  −7 |
                                              \ 0  1   5 /             | 0  1  0   2 |
                                                                       \ 0  0  1   5 /
9.77. Hint. Use the result of Problem 9.57.
chapter 10
Canonical Forms
INTRODUCTION
Let T be a linear operator on a vector space of finite dimension. As seen in the preceding
chapter, T may not have a diagonal matrix representation. However, it is still possible
to "simplify" the matrix representation of T in a number of ways. This is the main topic
of this chapter. In particular, we obtain the primary decomposition theorem, and the
triangular, Jordan and rational canonical forms.
We comment that the triangular and Jordan canonical forms exist for T if and only if
the characteristic polynomial Δ(t) of T has all its roots in the base field K. This is always
true if K is the complex field C but may not be true if K is the real field R.
We also introduce the idea of a quotient space. This is a very powerful tool and will be
used in the proof of the existence of the triangular and rational canonical forms.
TRIANGULAR FORM
Let T be a linear operator on an n-dimensional vector space V. Suppose T can be rep-
resented by the triangular matrix

    A = / a11  a12  ...  a1n \
        |  0   a22  ...  a2n |
        | .................. |
        \  0    0   ...  ann /

Then the characteristic polynomial of T,

    Δ(t) = |tI − A| = (t − a11)(t − a22) ··· (t − ann)

is a product of linear factors. The converse is also true and is an important theorem;
namely,
Theorem 10.1: Let T : V → V be a linear operator whose characteristic polynomial factors
into linear polynomials. Then there exists a basis of V in which T is
represented by a triangular matrix.
Alternate Form of Theorem 10.1: Let A be a square matrix whose characteristic poly-
nomial factors into linear polynomials. Then A is
similar to a triangular matrix, i.e. there exists an
invertible matrix P such that P⁻¹AP is triangular.
We say that an operator T can be brought into triangular form if it can be represented
by a triangular matrix. Note that in this case the eigenvalues of T are precisely those
entries appearing on the main diagonal. We give an application of this remark.
Example 10.1: Let A be a square matrix over the complex field C. Suppose λ is an eigenvalue of A².
Show that √λ or −√λ is an eigenvalue of A. We know by the above theorem that
A is similar to a triangular matrix

    B = / μ1  ...      \
        |     μ2  ...  |
        \          μn  /

Hence A² is similar to the matrix

    B² = / μ1²  ...       \
         |      μ2²  ...  |
         \           μn²  /

Since similar matrices have the same eigenvalues, λ = μi² for some i. Hence
μi = √λ or μi = −√λ; that is, √λ or −√λ is an eigenvalue of A.
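A numerical illustration of Example 10.1 with an arbitrary (invented) real matrix whose eigenvalues happen to be complex: every eigenvalue of A² is the square of some eigenvalue of A.

```python
import numpy as np

A = np.array([[1.0, 2.0], [-3.0, 1.0]])   # eigenvalues 1 ± i*sqrt(6)
mu = np.linalg.eigvals(A)
lam = np.linalg.eigvals(A @ A)

# Each eigenvalue of A^2 matches the square of some eigenvalue of A.
for l in lam:
    assert np.min(np.abs(mu**2 - l)) < 1e-9
```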
INVARIANCE
Let T : V → V be linear. A subspace W of V is said to be invariant under T or
T-invariant if T maps W into itself, i.e. if v ∈ W implies T(v) ∈ W. In this case T
restricted to W defines a linear operator on W; that is, T induces a linear operator T̂ : W → W
defined by T̂(w) = T(w) for every w ∈ W.
Example 10.2: Let T : R³ → R³ be the linear operator which rotates each vector about the z axis
by an angle θ:
T(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z)
Observe that each vector w = (a, b, 0) in the xy plane W remains in W under the
mapping T, i.e. W is T-invariant. Observe also that the z axis U is invariant
under T. Furthermore, the restriction of T to W rotates each vector about the
origin O, and the restriction of T to U is the identity mapping on U.
Example 10.3: Nonzero eigenvectors of a linear operator T : V → V may be characterized as gen-
erators of T-invariant 1-dimensional subspaces. For suppose T(v) = λv, v ≠ 0.
Then W = {kv : k ∈ K}, the 1-dimensional subspace generated by v, is invariant
under T because
T(kv) = k T(v) = k(λv) = kλv ∈ W
Conversely, suppose dim U = 1 and u ≠ 0 generates U, and U is invariant under
T. Then T(u) ∈ U and so T(u) is a multiple of u, i.e. T(u) = μu. Hence u is an
eigenvector of T.
The next theorem gives us an important class of invariant subspaces.
Theorem 10.2: Let T : V → V be linear, and let f(t) be any polynomial. Then the kernel
of f(T) is invariant under T.
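Theorem 10.2 can be illustrated numerically: take any matrix A and polynomial f, compute a basis of Ker f(A), and check that A maps it back into that kernel. The matrix below is an arbitrary example of my own choosing.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
I = np.eye(3)
fA = (A - 2*I) @ (A - 2*I)            # f(t) = (t - 2)^2

# Basis of Ker f(A): right singular vectors with zero singular value.
_, s, Vt = np.linalg.svd(fA)
null_basis = Vt[s < 1e-10]
assert len(null_basis) == 2           # here Ker f(A) is 2-dimensional

# A maps each kernel vector to another kernel vector (invariance).
for v in null_basis:
    assert np.allclose(fA @ (A @ v), 0)
```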
The notion of invariance is related to matrix representations as follows.
Theorem 10.3: Suppose W is an invariant subspace of T : V → V. Then T has a block
matrix representation ( A  B ; 0  C ) where A is a matrix representation of
the restriction of T to W.
INVARIANT DIRECT-SUM DECOMPOSITIONS
A vector space V is termed the direct sum of its subspaces W1, ..., Wr, written
V = W1 ⊕ W2 ⊕ ··· ⊕ Wr
if every vector v ∈ V can be written uniquely in the form
v = w1 + w2 + ··· + wr   with   wi ∈ Wi
The following theorem applies.
Theorem 10.4: Suppose W1, ..., Wr are subspaces of V, and suppose
{w11, ..., w1n1}, ..., {wr1, ..., wrnr}
are bases of W1, ..., Wr respectively. Then V is the direct sum of the
Wi if and only if the union {w11, ..., w1n1, ..., wr1, ..., wrnr} is a basis
of V.
Now suppose T : V → V is linear and V is the direct sum of (nonzero) T-invariant
subspaces W1, ..., Wr:
V = W1 ⊕ ··· ⊕ Wr   and   T(Wi) ⊆ Wi,  i = 1, ..., r
Let Ti denote the restriction of T to Wi. Then T is said to be decomposable into the operators
Ti, or T is said to be the direct sum of the Ti, written T = T1 ⊕ ··· ⊕ Tr. Also, the sub-
spaces W1, ..., Wr are said to reduce T or to form a T-invariant direct-sum decomposition of V.
Consider the special case where two subspaces U and W reduce an operator T : V → V;
say, dim U = 2 and dim W = 3, and suppose {u1, u2} and {w1, w2, w3} are bases of U and
W respectively. If T1 and T2 denote the restrictions of T to U and W respectively, then

    T1(u1) = a11u1 + a12u2        T2(w1) = b11w1 + b12w2 + b13w3
    T1(u2) = a21u1 + a22u2        T2(w2) = b21w1 + b22w2 + b23w3
                                  T2(w3) = b31w1 + b32w2 + b33w3

Hence

    A = / a11  a21 \        and        B = / b11  b21  b31 \
        \ a12  a22 /                       | b12  b22  b32 |
                                           \ b13  b23  b33 /

are matrix representations of T1 and T2 respectively. By the above theorem, {u1, u2, w1, w2, w3}
is a basis of V. Since T(ui) = T1(ui) and T(wj) = T2(wj), the matrix of T in this basis is
the block diagonal matrix ( A  0 ; 0  B ).
A generalization of the above argument gives us the following theorem.
Theorem 10.5: Suppose T : V → V is linear and V is the direct sum of T-invariant sub-
spaces W1, ..., Wr. If Ai is a matrix representation of the restriction of
T to Wi, then T can be represented by the block diagonal matrix
    M = / A1            \
        |    A2         |
        |       ...     |
        \            Ar /

The block diagonal matrix M with diagonal entries A1, ..., Ar is sometimes called the
direct sum of the matrices A1, ..., Ar and denoted by M = A1 ⊕ ··· ⊕ Ar.
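Direct sums of matrices behave well under the characteristic polynomial, as Problem 10.8(ii) shows: Δ_M(t) = Δ_{A1}(t)·Δ_{A2}(t). A sketch with two invented 2×2 blocks:

```python
import numpy as np

A1 = np.array([[0.0, -3.0], [1.0, 1.0]])
A2 = np.array([[2.0, 1.0], [0.0, 2.0]])

# Direct sum M = A1 (+) A2 as a block diagonal matrix.
M = np.zeros((4, 4))
M[:2, :2] = A1
M[2:, 2:] = A2

# np.poly gives characteristic polynomial coefficients (leading first);
# np.convolve multiplies polynomials.
lhs = np.poly(M)
rhs = np.convolve(np.poly(A1), np.poly(A2))
assert np.allclose(lhs, rhs)
```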
PRIMARY DECOMPOSITION
The following theorem shows that any operator T : V → V is decomposable into oper-
ators whose minimal polynomials are powers of irreducible polynomials. This is the first
step in obtaining a canonical form for T.

Primary Decomposition Theorem 10.6: Let T : V → V be a linear operator with minimal
polynomial
m(t) = f1(t)^{n1} f2(t)^{n2} ··· fr(t)^{nr}
where the fi(t) are distinct monic irreducible polynomials. Then V is the
direct sum of T-invariant subspaces W1, ..., Wr where Wi is the kernel of
fi(T)^{ni}. Moreover, fi(t)^{ni} is the minimal polynomial of the restriction of
T to Wi.
Since the polynomials fi(t)^{ni} are relatively prime, the above fundamental result follows
(Problem 10.11) from the next two theorems.

Theorem 10.7: Suppose T : V → V is linear, and suppose f(t) = g(t)h(t) are polynomials
such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the
direct sum of the T-invariant subspaces U and W, where U = Ker g(T)
and W = Ker h(T).
Theorem 10.8: In Theorem 10.7, if f(t) is the minimal polynomial of T [and g(t) and h(t)
are monic], then g(t) and h(t) are the minimal polynomials of the
restrictions of T to U and W respectively.
We will also use the primary decomposition theorem to prove the following useful
characterization of diagonalizable operators.
Theorem 10.9: A linear operator T : V → V has a diagonal matrix representation if and
only if its minimal polynomial m(t) is a product of distinct linear
polynomials.
Alternate Form of Theorem 10.9: A matrix A is similar to a diagonal matrix if and only
if its minimal polynomial is a product of distinct linear polynomials.
Example 10.4: Suppose A ≠ I is a square matrix for which A³ = I. Determine whether or not
A is similar to a diagonal matrix if A is a matrix over (i) the real field R, (ii) the
complex field C.
Since A³ = I, A is a zero of the polynomial f(t) = t³ − 1 = (t − 1)(t² + t + 1).
The minimal polynomial m(t) of A cannot be t − 1, since A ≠ I. Hence
m(t) = t² + t + 1   or   m(t) = t³ − 1
Since neither polynomial is a product of linear polynomials over R, A is not diag-
onalizable over R. On the other hand, each of the polynomials is a product of distinct
linear polynomials over C. Hence A is diagonalizable over C.
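A concrete instance of Example 10.4 (the matrix is my own choice, not from the text): rotation of the plane by 120 degrees satisfies A³ = I with A ≠ I, and its eigenvalues are non-real cube roots of unity, so no real diagonalization exists while a complex one does.

```python
import numpy as np

c, s = np.cos(2*np.pi/3), np.sin(2*np.pi/3)
A = np.array([[c, -s], [s, c]])       # rotation by 120 degrees

assert np.allclose(np.linalg.matrix_power(A, 3), np.eye(2))
assert not np.allclose(A, np.eye(2))

# Eigenvalues are exp(+-2*pi*i/3): genuinely complex and distinct,
# so A diagonalizes over C but not over R.
evals = np.linalg.eigvals(A)
assert np.all(np.abs(evals.imag) > 0.5)
```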
NILPOTENT OPERATORS
A linear operator T : V → V is termed nilpotent if Tⁿ = 0 for some positive integer n;
we call k the index of nilpotency of T if T^k = 0 but T^{k−1} ≠ 0. Analogously, a square matrix
A is termed nilpotent if Aⁿ = 0 for some positive integer n, and of index k if A^k = 0 but
A^{k−1} ≠ 0. Clearly the minimum polynomial of a nilpotent operator (matrix) of index k is
m(t) = t^k; hence 0 is its only eigenvalue.
The fundamental result on nilpotent operators follows.
Theorem 10.10: Let T : V → V be a nilpotent operator of index k. Then T has a block
diagonal matrix representation whose diagonal entries are of the form
    N = / 0  1  0  ...  0  0 \
        | 0  0  1  ...  0  0 |
        | .................. |
        | 0  0  0  ...  0  1 |
        \ 0  0  0  ...  0  0 /

(i.e. all entries of N are 0 except those just above the main diagonal, where
they are 1). There is at least one N of order k and all other N are of orders
≤ k. The number of N of each possible order is uniquely determined by
T. Moreover, the total number of N of all orders is equal to the nullity
of T.
In the proof of the above theorem, we shall show that the number of N of order i is
2m_i − m_{i+1} − m_{i−1}, where m_i is the nullity of T^i.
We remark that the above matrix N is itself nilpotent and that its index of nilpotency is
equal to its order (Problem 10.13). Note that the matrix N of order 1 is just the 1 × 1 zero
matrix (0).
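The block-count formula 2m_i − m_{i+1} − m_{i−1} can be tested directly: build a nilpotent matrix from basic blocks of known orders (the orders below are an arbitrary choice) and recover those orders from the nullities m_i = dim Ker T^i.

```python
import numpy as np

def basic_block(size):
    # The matrix N above: 1's just over the main diagonal, 0's elsewhere.
    return np.eye(size, k=1)

orders = [3, 2, 2, 1]                 # invented block structure; index k = 3
n = sum(orders)
T = np.zeros((n, n))
pos = 0
for sz in orders:
    T[pos:pos+sz, pos:pos+sz] = basic_block(sz)
    pos += sz

def nullity(i):
    # m_i = n - rank(T^i); by convention m_0 = 0.
    return n - np.linalg.matrix_rank(np.linalg.matrix_power(T, i)) if i else 0

# Number of blocks of order i equals 2 m_i - m_{i+1} - m_{i-1}.
for i in range(1, 4):
    predicted = 2*nullity(i) - nullity(i + 1) - nullity(i - 1)
    assert predicted == orders.count(i)
```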
JORDAN CANONICAL FORM
An operator T can be put into Jordan canonical form if its characteristic and minimal
polynomials factor into linear polynomials. This is always true if K is the complex field C.
In any case, we can always extend the base field K to a field in which the characteristic
and minimum polynomials do factor into linear factors; thus in a broad sense every operator
has a Jordan canonical form. Analogously, every matrix is similar to a matrix in Jordan
canonical form.
Theorem 10.11: Let T : V → V be a linear operator whose characteristic and minimum
polynomials are respectively

    Δ(t) = (t − λ1)^{n1} ··· (t − λr)^{nr}   and   m(t) = (t − λ1)^{m1} ··· (t − λr)^{mr}

where the λi are distinct scalars. Then T has a block diagonal matrix
representation J whose diagonal entries are of the form

    Jij = / λi  1   0  ...  0  \
          | 0   λi  1  ...  0  |
          | .................. |
          | 0   0   0  ...  1  |
          \ 0   0   0  ...  λi /

For each λi the corresponding blocks Jij have the following properties:
(i) There is at least one Jij of order mi; all other Jij are of order ≤ mi.
(ii) The sum of the orders of the Jij is ni.
(iii) The number of Jij equals the geometric multiplicity of λi.
(iv) The number of Jij of each possible order is uniquely determined by T.
The matrix J appearing in the above theorem is called the Jordan canonical form of the
operator T. A diagonal block Jij is called a Jordan block belonging to the eigenvalue λi.
Observe that

    / λi  1  ...  0  \       / λi  0  ...  0  \       / 0  1  ...  0 \
    | 0   λi ...  0  |   =   | 0   λi ...  0  |   +   | 0  0  ...  1 |
    | .............. |       | .............. |       | ............ |
    \ 0   0  ...  λi /       \ 0   0  ...  λi /       \ 0  0  ...  0 /

That is,
    Jij = λi I + N
where N is the nilpotent block appearing in Theorem 10.10. In fact, we prove the above
theorem (Problem 10.18) by showing that T can be decomposed into operators, each the sum
of a scalar and a nilpotent operator.
Example 10.5: Suppose the characteristic and minimum polynomials of an operator T are respec-
tively
Δ(t) = (t − 2)⁴(t − 3)³   and   m(t) = (t − 2)²(t − 3)²
Then the Jordan canonical form of T is one of the following matrices:

    / 2  1                \           / 2  1                \
    | 0  2                |           | 0  2                |
    |       2  1          |           |       2             |
    |       0  2          |    or     |          2          |
    |             3  1    |           |             3  1    |
    |             0  3    |           |             0  3    |
    \                  3  /           \                  3  /

The first matrix occurs if T has two independent eigenvectors belonging to its eigen-
value 2; and the second matrix occurs if T has three independent eigenvectors be-
longing to 2.
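The first Jordan form in Example 10.5, in block terms diag(J2(2), J2(2), J2(3), J1(3)), can be checked numerically: (t − 2)²(t − 3)² annihilates it, while the proper divisors (t − 2)(t − 3)² and (t − 2)²(t − 3) do not.

```python
import numpy as np
from functools import reduce

def jordan_block(lam, size):
    return lam * np.eye(size) + np.eye(size, k=1)

blocks = [jordan_block(2, 2), jordan_block(2, 2), jordan_block(3, 2), jordan_block(3, 1)]
n = sum(b.shape[0] for b in blocks)
J = np.zeros((n, n))
pos = 0
for b in blocks:
    k = b.shape[0]
    J[pos:pos+k, pos:pos+k] = b
    pos += k
I = np.eye(n)

def apply_factors(*fs):
    # product of powers (J - lam*I)^e for the given (lam, e) pairs
    mats = [np.linalg.matrix_power(J - lam*I, e) for lam, e in fs]
    return reduce(np.matmul, mats)

assert np.allclose(apply_factors((2, 2), (3, 2)), 0)       # m(J) = 0
assert not np.allclose(apply_factors((2, 1), (3, 2)), 0)   # lower powers fail
assert not np.allclose(apply_factors((2, 2), (3, 1)), 0)
```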
CYCLIC SUBSPACES
Let T be a linear operator on a vector space V of finite dimension over K. Suppose
v ∈ V and v ≠ 0. The set of all vectors of the form f(T)(v), where f(t) ranges over all
polynomials over K, is a T-invariant subspace of V called the T-cyclic subspace of V gen-
erated by v; we denote it by Z(v, T) and denote the restriction of T to Z(v, T) by T_v. We
could equivalently define Z(v, T) as the intersection of all T-invariant subspaces of V
containing v.
Now consider the sequence
v, T(v), T²(v), T³(v), ...
of powers of T acting on v. Let k be the lowest integer such that T^k(v) is a linear com-
bination of those vectors which precede it in the sequence; say,
T^k(v) = −a_{k−1}T^{k−1}(v) − ··· − a1T(v) − a0v
Then
m_v(t) = t^k + a_{k−1}t^{k−1} + ··· + a1t + a0
is the unique monic polynomial of lowest degree for which m_v(T)(v) = 0. We call m_v(t) the
T-annihilator of v and Z(v, T).
The following theorem applies.
Theorem 10.12: Let Z(v, T), T_v and m_v(t) be defined as above. Then:
(i) The set {v, T(v), ..., T^{k−1}(v)} is a basis of Z(v, T); hence dim Z(v, T) = k.
(ii) The minimal polynomial of T_v is m_v(t).
(iii) The matrix representation of T_v in the above basis is
    C = / 0  0  ...  0  −a0      \
        | 1  0  ...  0  −a1      |
        | 0  1  ...  0  −a2      |
        | ...................... |
        | 0  0  ...  0  −a_{k−2} |
        \ 0  0  ...  1  −a_{k−1} /

The above matrix C is called the companion matrix of the polynomial m_v(t).
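The T-annihilator can be computed mechanically: stack the Krylov vectors v, Tv, T²v, ... until one becomes dependent on its predecessors, then solve for the coefficients. A sketch (the matrix and vector are invented; here v turns out to be a cyclic vector, so its annihilator is the full characteristic polynomial):

```python
import numpy as np

def t_annihilator(A, v, tol=1e-9):
    """Return (a_0, ..., a_{k-1}, 1): monic T-annihilator of v under A."""
    cols = [v]
    while True:
        w = A @ cols[-1]
        K = np.column_stack(cols)
        # least-squares solve K x = w; w is dependent iff the residual is ~ 0
        x, _, _, _ = np.linalg.lstsq(K, w, rcond=None)
        if np.linalg.norm(K @ x - w) < tol:
            return np.concatenate([-x, [1.0]])
        cols.append(w)

A = np.array([[0.0, 0.0, -8.0],
              [1.0, 0.0, -6.0],
              [0.0, 1.0,  5.0]])    # companion matrix of t^3 - 5t^2 + 6t + 8
v = np.array([1.0, 0.0, 0.0])
coeffs = t_annihilator(A, v)
# Annihilator is t^3 - 5t^2 + 6t + 8, i.e. (a0, a1, a2, leading) = (8, 6, -5, 1)
assert np.allclose(coeffs, [8.0, 6.0, -5.0, 1.0])
```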
RATIONAL CANONICAL FORM
In this section we present the rational canonical form for a linear operator T : V → V.
We emphasize that this form exists even when the minimal polynomial cannot be factored
into linear polynomials. (Recall that this is not the case for the Jordan canonical form.)
Lemma 10.13: Let T : V → V be a linear operator whose minimal polynomial is f(t)ⁿ where
f(t) is a monic irreducible polynomial. Then V is the direct sum
V = Z(v1, T) ⊕ ··· ⊕ Z(vr, T)
of T-cyclic subspaces Z(vi, T) with corresponding T-annihilators
f(t)^{n1}, f(t)^{n2}, ..., f(t)^{nr},   n = n1 ≥ n2 ≥ ··· ≥ nr
Any other decomposition of V into T-cyclic subspaces has the same number
of components and the same set of T-annihilators.
We emphasize that the above lemma does not say that the vectors vi or the T-cyclic sub-
spaces Z(vi, T) are uniquely determined by T; but it does say that the set of T-annihilators
is uniquely determined by T. Thus T has a unique matrix representation
    / C1            \
    |    C2         |
    |       ...     |
    \            Cr /
where the Ci are companion matrices. In fact, the Ci are the companion matrices of the
polynomials f(t)^{ni}.
Using the primary decomposition theorem and the above lemma, we obtain the following
fundamental result.
Theorem 10.14: Let T : V → V be a linear operator with minimal polynomial
m(t) = f1(t)^{m1} f2(t)^{m2} ··· fs(t)^{ms}
where the fi(t) are distinct monic irreducible polynomials. Then T has a
unique block diagonal matrix representation

    / C11                    \
    |     ...                |
    |        C1r1            |
    |             ...        |
    \                 Csrs   /

where the Cij are companion matrices. In particular, the Cij are the com-
panion matrices of the polynomials fi(t)^{nij} where
m1 = n11 ≥ n12 ≥ ··· ≥ n1r1,   ...,   ms = ns1 ≥ ns2 ≥ ··· ≥ nsrs
The above matrix representation of T is called its rational canonical form. The poly-
nomials fi(t)^{nij} are called the elementary divisors of T.

Example 10.6: Let V be a vector space of dimension 6 over R, and let T be a linear operator whose
minimal polynomial is m(t) = (t² − t + 3)(t − 2)². Then the rational canonical form
of T is one of the following direct sums of companion matrices:
(i) C(t² − t + 3) ⊕ C(t² − t + 3) ⊕ C((t − 2)²)
(ii) C(t² − t + 3) ⊕ C((t − 2)²) ⊕ C((t − 2)²)
(iii) C(t² − t + 3) ⊕ C((t − 2)²) ⊕ C(t − 2) ⊕ C(t − 2)
where C(f(t)) is the companion matrix of f(t); here
C(t² − t + 3) = / 0  −3 \ ,   C((t − 2)²) = / 0  −4 \ ,   C(t − 2) = (2)
                \ 1   1 /                   \ 1   4 /
QUOTIENT SPACES
Let V be a vector space over a field K and let W be a subspace of V. If v is any vector
in V, we write v + W for the set of sums v + w with w ∈ W:
v + W = {v + w : w ∈ W}
These sets are called the cosets of W in V. We show (Problem 10.22) that these cosets
partition V into mutually disjoint subsets.
Example 10.7: Let W be the subspace of R² defined by
W = {(a, b) : a = b}
That is, W is the line given by the equation x − y = 0. We can view v + W as a
translation of the line, obtained by adding the vector v to each point in W. Thus
v + W is also a line and is parallel to W, and the cosets of W in R² are precisely
all the lines parallel to W.
In the next theorem we use the cosets of a subspace W of a vector space V to define a
new vector space; it is called the quotient space of V by W and is denoted by V/W.
Theorem 10.15: Let W be a subspace of a vector space V over a field K. Then the cosets of
W in V form a vector space over K with the following operations of addi-
tion and scalar multiplication:
(i) (u + W) + (v + W) = (u + v) + W
(ii) k(u + W) = ku + W, where k ∈ K.
We note that, in the proof of the above theorem, it is first necessary to show that the
operations are well defined; that is, whenever u + W = u′ + W and v + W = v′ + W, then
(i) (u + v) + W = (u′ + v′) + W   and   (ii) ku + W = ku′ + W, for any k ∈ K
In the case of an invariant subspace, we have the following useful result.
Theorem 10.16: Suppose W is a subspace invariant under a linear operator T : V → V.
Then T induces a linear operator T̄ on V/W defined by T̄(v + W) =
T(v) + W. Moreover, if T is a zero of any polynomial, then so is T̄.
Thus the minimum polynomial of T̄ divides the minimum polynomial of T.
Solved Problems
INVARIANT SUBSPACES
10.1. Suppose T : V → V is linear. Show that each of the following is invariant under T:
(i) {0}, (ii) V, (iii) kernel of T, (iv) image of T.
(i) We have T(0) = 0 ∈ {0}; hence {0} is invariant under T.
(ii) For every v ∈ V, T(v) ∈ V; hence V is invariant under T.
(iii) Let u ∈ Ker T. Then T(u) = 0 ∈ Ker T since the kernel of T is a subspace of V. Thus
Ker T is invariant under T.
(iv) Since T(v) ∈ Im T for every v ∈ V, it is certainly true if v ∈ Im T. Hence the image of
T is invariant under T.
10.2. Suppose {Wi} is a collection of T-invariant subspaces of a vector space V. Show that
the intersection W = ∩i Wi is also T-invariant.
Suppose v ∈ W; then v ∈ Wi for every i. Since Wi is T-invariant, T(v) ∈ Wi for every i.
Thus T(v) ∈ W = ∩i Wi and so W is T-invariant.
10.3. Prove Theorem 10.2: Let T : V → V be any linear operator and let f(t) be any poly-
nomial. Then the kernel of f(T) is invariant under T.
Suppose v ∈ Ker f(T), i.e. f(T)(v) = 0. We need to show that T(v) also belongs to the kernel
of f(T), i.e. f(T)(T(v)) = 0. Since f(t)t = tf(t), we have f(T)T = Tf(T). Thus
f(T)T(v) = Tf(T)(v) = T(0) = 0
as required.
10.4. Find all invariant subspaces of A = / 2  −5 \  viewed as an operator on R².
                                          \ 1  −2 /
First of all, we have that R² and {0} are invariant under A. Now if A has any other invariant
subspace, it must be 1-dimensional. However, the characteristic polynomial of A is

    Δ(t) = |tI − A| = | t − 2    5   | = t² + 1
                      |  −1    t + 2 |

Hence A has no eigenvalues (in R) and so A has no eigenvectors. But the 1-dimensional invariant
subspaces correspond to the eigenvectors; thus R² and {0} are the only subspaces invariant under A.
10.5. Prove Theorem 10.3: Suppose W is an invariant subspace of T : V → V. Then T
has a block matrix representation ( A  B ; 0  C ) where A is a matrix representa-
tion of the restriction T̂ of T to W.
We choose a basis {w1, ..., wr} of W and extend it to a basis {w1, ..., wr, v1, ..., vs} of V.
We have
    T̂(w1) = T(w1) = a11w1 + ··· + a1rwr
    T̂(w2) = T(w2) = a21w1 + ··· + a2rwr
    ....................................
    T̂(wr) = T(wr) = ar1w1 + ··· + arrwr
    T(v1) = b11w1 + ··· + b1rwr + c11v1 + ··· + c1svs
    T(v2) = b21w1 + ··· + b2rwr + c21v1 + ··· + c2svs
    ....................................
    T(vs) = bs1w1 + ··· + bsrwr + cs1v1 + ··· + cssvs

But the matrix of T in this basis is the transpose of the matrix of coefficients in the above system
of equations. (See page 150.) Therefore it has the form ( A  B ; 0  C ) where A is the transpose of
the matrix of coefficients for the obvious subsystem. By the same argument, A is the matrix of
T̂ relative to the basis {wi} of W.
10.6. Let T̂ denote the restriction of an operator T to an invariant subspace W, i.e.
T̂(w) = T(w) for every w ∈ W. Prove:
(i) For any polynomial f(t), f(T̂)(w) = f(T)(w).
(ii) The minimum polynomial of T̂ divides the minimum polynomial of T.
(i) If f(t) = 0 or if f(t) is of degree ≤ 1, then the result clearly holds. Assume
deg f = n > 1 and that the result holds for polynomials of degree less than n. Suppose that
f(t) = an tⁿ + a_{n−1}t^{n−1} + ··· + a1t + a0
Then
f(T̂)(w) = (an T̂ⁿ + a_{n−1}T̂^{n−1} + ··· + a0 I)(w)
= (an T̂^{n−1})(T̂(w)) + (a_{n−1}T̂^{n−1} + ··· + a0 I)(w)
= (an T^{n−1})(T(w)) + (a_{n−1}T^{n−1} + ··· + a0 I)(w)
= f(T)(w)
(ii) Let m(t) denote the minimum polynomial of T. Then by (i), m(T̂)(w) = m(T)(w) = 0(w) = 0
for every w ∈ W; that is, T̂ is a zero of the polynomial m(t). Hence the minimum polynomial
of T̂ divides m(t).
INVARIANT DIRECT-SUM DECOMPOSITIONS
10.7. Prove Theorem 10.4: Suppose Wi, . . .,Wr are subspaces of V and suppose, for
i = l, . . .,r, {wii, . . ., Wi„^} is a basis of Wu Then V is the direct sum of the Wi if
and only if the union
, . „ ,, B = {Wu, . . . , Win,, . . . , Wrl, . . ., Wm.)
IS a basis of V.
Suppose B is a basis of V. Then, for any v &V,
V = duWii + • • • + ai„jWi„j + ■ • • + a^iW^i + • • • + a^w^^ = Wi + W2+ ■•■ + w^
where Wj = aai^n + ■ • • + ai„.Wi„. G PTj. We next show that such a sum is unique. Suppose
V — w'l + W2 + ■ • ■ + w'r where W; S Wi
Since {wji, . . . , Win.} is a basis of Wi, w[ = b^Wn + • • • + 6i„.W{„. and so
V = 611W11 + • • ■ + 6i„jWi„^ + • • • + ftrl^rl + ■ • ■ + Kn^^m^
Since B is a basis of V, Oy = 6y, for each i and each j. Hence w^ = w,' and so the sum for v
is unique. Accordingly, V is the direct sum of the PFj.
Conversely, suppose V is the direct sum of the W_i. Then for any v ∈ V, v = w_1 + ... + w_r where w_i ∈ W_i. Since {w_{ij}} is a basis of W_i, each w_i is a linear combination of the w_{ij} and so v is a linear combination of the elements of B. Thus B spans V. We now show that B is linearly independent. Suppose
a_{11}w_{11} + ... + a_{1n_1}w_{1n_1} + ... + a_{r1}w_{r1} + ... + a_{rn_r}w_{rn_r} = 0

232 CANONICAL FORMS [CHAP. 10

Note that a_{i1}w_{i1} + ... + a_{in_i}w_{in_i} ∈ W_i. We also have that 0 = 0 + 0 + ... + 0 where 0 ∈ W_i. Since such a sum for 0 is unique,
a_{i1}w_{i1} + ... + a_{in_i}w_{in_i} = 0   for i = 1, ..., r
The independence of the bases {w_{ij}} implies that all the a's are 0. Thus B is linearly independent and hence is a basis of V.
10.8. Suppose T : V → V is linear and suppose T = T_1 ⊕ T_2 with respect to a T-invariant direct-sum decomposition V = U ⊕ W. Show that:
(i) m(t) is the least common multiple of m_1(t) and m_2(t), where m(t), m_1(t) and m_2(t) are the minimum polynomials of T, T_1 and T_2 respectively;
(ii) Δ(t) = Δ_1(t) Δ_2(t), where Δ(t), Δ_1(t) and Δ_2(t) are the characteristic polynomials of T, T_1 and T_2 respectively.

(i) By Problem 10.6, each of m_1(t) and m_2(t) divides m(t). Now suppose f(t) is a multiple of both m_1(t) and m_2(t); then f(T_1)(U) = 0 and f(T_2)(W) = 0. Let v ∈ V; then v = u + w with u ∈ U and w ∈ W. Now
f(T) v = f(T) u + f(T) w = f(T_1) u + f(T_2) w = 0 + 0 = 0
That is, T is a zero of f(t). Hence m(t) divides f(t), and so m(t) is the least common multiple of m_1(t) and m_2(t).

(ii) By Theorem 10.5, T has a matrix representation M = ( A 0 ; 0 B ) where A and B are matrix representations of T_1 and T_2 respectively. Then, by Problem 9.66,
Δ(t) = |tI - M| = |tI - A| |tI - B| = Δ_1(t) Δ_2(t)
as required.
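Part (ii) is easy to verify in exact arithmetic; a SymPy sketch with blocks invented for the example:

```python
import sympy as sp

t = sp.symbols('t')

# T = T1 ⊕ T2: a block-diagonal matrix built from the matrices of T1 and T2.
A = sp.Matrix([[1, 2], [0, 3]])   # matrix of T1
B = sp.Matrix([[4, 0], [1, 4]])   # matrix of T2
M = sp.diag(A, B)

# The characteristic polynomial of T is the product of those of T1 and T2.
lhs = M.charpoly(t).as_expr()
rhs = sp.expand(A.charpoly(t).as_expr() * B.charpoly(t).as_expr())
print(sp.simplify(lhs - rhs))  # 0
```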
10.9. Prove Theorem 10.7: Suppose T : V → V is linear, and suppose f(t) = g(t) h(t) are polynomials such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the direct sum of the T-invariant subspaces U and W where U = Ker g(T) and W = Ker h(T).

Note first that U and W are T-invariant by Theorem 10.2. Now since g(t) and h(t) are relatively prime, there exist polynomials r(t) and s(t) such that
r(t) g(t) + s(t) h(t) = 1
Hence for the operator T,
r(T) g(T) + s(T) h(T) = I    (*)
Let v ∈ V; then by (*),
v = r(T) g(T) v + s(T) h(T) v
But the first term in this sum belongs to W = Ker h(T) since
h(T) r(T) g(T) v = r(T) g(T) h(T) v = r(T) f(T) v = r(T) 0(v) = 0
Similarly, the second term belongs to U. Hence V is the sum of U and W.

To prove that V = U ⊕ W, we must show that a sum v = u + w with u ∈ U, w ∈ W, is uniquely determined by v. Applying the operator r(T) g(T) to v = u + w and using g(T) u = 0, we obtain
r(T) g(T) v = r(T) g(T) u + r(T) g(T) w = r(T) g(T) w
Also, applying (*) to w alone and using h(T) w = 0, we obtain
w = r(T) g(T) w + s(T) h(T) w = r(T) g(T) w
Both of the above formulas give us w = r(T) g(T) v, and so w is uniquely determined by v. Similarly u is uniquely determined by v. Hence V = U ⊕ W, as required.
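The Bezout identity r(t) g(t) + s(t) h(t) = 1 at the heart of the proof can be produced by the extended Euclidean algorithm. A SymPy sketch (the operator T = diag(1, 1, 2) and the factorization f(t) = (t - 1)(t - 2) are invented for the example) checks that V splits as Ker g(T) ⊕ Ker h(T):

```python
import sympy as sp

t = sp.symbols('t')

# Example: T = diag(1, 1, 2), f(t) = (t - 1)(t - 2) = g(t) h(t),
# with g(t) = t - 1 and h(t) = t - 2 relatively prime.
T = sp.diag(1, 1, 2)
r_, s_, one = sp.gcdex(t - 1, t - 2, t)   # r g + s h = gcd = 1
print(one)                                 # 1

gT = T - sp.eye(3)        # g(T)
hT = T - 2 * sp.eye(3)    # h(T)
print(gT * hT)            # zero matrix: f(T) = 0

U = gT.nullspace()        # basis of U = Ker g(T)
W = hT.nullspace()        # basis of W = Ker h(T)
print(len(U), len(W))     # 2 1  -- dim U + dim W = dim V = 3
```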
10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if f(t) is the minimal polynomial of T (and g(t) and h(t) are monic), then g(t) is the minimal polynomial of the restriction T_1 of T to U and h(t) is the minimal polynomial of the restriction T_2 of T to W.

Let m_1(t) and m_2(t) be the minimal polynomials of T_1 and T_2 respectively. Note that g(T_1) = 0 and h(T_2) = 0 because U = Ker g(T) and W = Ker h(T). Thus
m_1(t) divides g(t)   and   m_2(t) divides h(t)    (1)
By Problem 10.9, V = U ⊕ W and so, by Problem 10.8, f(t) is the least common multiple of m_1(t) and m_2(t). But m_1(t) and m_2(t) are relatively prime since g(t) and h(t) are relatively prime. Accordingly, f(t) = m_1(t) m_2(t). We also have f(t) = g(t) h(t). These two equations together with (1) and the fact that all the polynomials are monic imply that g(t) = m_1(t) and h(t) = m_2(t), as required.
10.11. Prove the Primary Decomposition Theorem 10.6: Let T : V → V be a linear operator with minimal polynomial
m(t) = f_1(t)^{n_1} f_2(t)^{n_2} ... f_r(t)^{n_r}
where the f_i(t) are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant subspaces W_1, ..., W_r where W_i is the kernel of f_i(T)^{n_i}. Moreover, f_i(t)^{n_i} is the minimal polynomial of the restriction of T to W_i.

The proof is by induction on r. The case r = 1 is trivial. Suppose that the theorem has been proved for r - 1. By Theorem 10.7 we can write V as the direct sum of T-invariant subspaces W_1 and V_1, where W_1 is the kernel of f_1(T)^{n_1} and where V_1 is the kernel of f_2(T)^{n_2} ... f_r(T)^{n_r}. By Theorem 10.8, the minimal polynomials of the restrictions of T to W_1 and V_1 are respectively f_1(t)^{n_1} and f_2(t)^{n_2} ... f_r(t)^{n_r}.

Denote the restriction of T to V_1 by T_1. By the inductive hypothesis, V_1 is the direct sum of subspaces W_2, ..., W_r such that W_i is the kernel of f_i(T_1)^{n_i} and such that f_i(t)^{n_i} is the minimal polynomial of the restriction of T_1 to W_i. But the kernel of f_i(T)^{n_i}, for i = 2, ..., r, is necessarily contained in V_1 since f_i(t)^{n_i} divides f_2(t)^{n_2} ... f_r(t)^{n_r}. Thus the kernel of f_i(T)^{n_i} is the same as the kernel of f_i(T_1)^{n_i}, which is W_i. Also, the restriction of T to W_i is the same as the restriction of T_1 to W_i (for i = 2, ..., r); hence f_i(t)^{n_i} is also the minimal polynomial of the restriction of T to W_i. Thus V = W_1 ⊕ W_2 ⊕ ... ⊕ W_r is the desired decomposition of T.
10.12. Prove Theorem 10.9: A linear operator T : V → V has a diagonal matrix representation if and only if its minimal polynomial m(t) is a product of distinct linear polynomials.

Suppose m(t) is a product of distinct linear polynomials; say,
m(t) = (t - λ_1)(t - λ_2) ... (t - λ_r)
where the λ_i are distinct scalars. By the primary decomposition theorem, V is the direct sum of subspaces W_1, ..., W_r where W_i = Ker (T - λ_i I). Thus if v ∈ W_i, then (T - λ_i I)(v) = 0 or T(v) = λ_i v. In other words, every vector in W_i is an eigenvector belonging to the eigenvalue λ_i. By Theorem 10.4, the union of bases for W_1, ..., W_r is a basis of V. This basis consists of eigenvectors and so T is diagonalizable.

Conversely, suppose T is diagonalizable, i.e. V has a basis consisting of eigenvectors of T. Let λ_1, ..., λ_s be the distinct eigenvalues of T. Then the operator
f(T) = (T - λ_1 I)(T - λ_2 I) ... (T - λ_s I)
maps each basis vector into 0. Thus f(T) = 0 and hence the minimum polynomial m(t) of T divides the polynomial
f(t) = (t - λ_1)(t - λ_2) ... (t - λ_s)
Accordingly, m(t) is a product of distinct linear polynomials.
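A practical test of Theorem 10.9 (a sketch; the helper name is ours): m(t) and the characteristic polynomial Δ(t) have the same irreducible factors, so m(t) is a product of distinct factors exactly when the squarefree part of Δ(t) already annihilates the matrix.

```python
import sympy as sp

t = sp.symbols('t')

def annihilated_by_squarefree_part(M):
    """True iff the squarefree part of the characteristic polynomial
    annihilates M -- equivalently, iff the minimal polynomial of M is a
    product of distinct irreducible factors (Theorem 10.9)."""
    p = M.charpoly(t).as_expr()
    radical = sp.quo(p, sp.gcd(p, sp.diff(p, t)), t)  # squarefree part of p
    coeffs = sp.Poly(radical, t).all_coeffs()
    n = M.rows
    value = sum((c * M**i for i, c in enumerate(reversed(coeffs))),
                sp.zeros(n, n))
    return value == sp.zeros(n, n)

print(annihilated_by_squarefree_part(sp.Matrix([[2, 0], [0, 3]])))  # True
print(annihilated_by_squarefree_part(sp.Matrix([[2, 1], [0, 2]])))  # False
```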
NILPOTENT OPERATORS, JORDAN CANONICAL FORM

10.13. Let T : V → V be linear. Suppose, for v ∈ V, T^k(v) = 0 but T^{k-1}(v) ≠ 0. Prove:
(i) The set S = {v, T(v), ..., T^{k-1}(v)} is linearly independent.
(ii) The subspace W generated by S is T-invariant.
(iii) The restriction T̂ of T to W is nilpotent of index k.
(iv) Relative to the basis {T^{k-1}(v), ..., T(v), v} of W, the matrix of T̂ is the k-square matrix

    ( 0 1 0 ... 0 0 )
    ( 0 0 1 ... 0 0 )
    ( .............. )
    ( 0 0 0 ... 0 1 )
    ( 0 0 0 ... 0 0 )

Hence the above k-square matrix is nilpotent of index k.

(i) Suppose
a v + a_1 T(v) + a_2 T^2(v) + ... + a_{k-1} T^{k-1}(v) = 0    (*)
Applying T^{k-1} to (*) and using T^k(v) = 0, we obtain a T^{k-1}(v) = 0; since T^{k-1}(v) ≠ 0, a = 0. Now applying T^{k-2} to (*) and using T^k(v) = 0 and a = 0, we find a_1 T^{k-1}(v) = 0; hence a_1 = 0. Next applying T^{k-3} to (*) and using T^k(v) = 0 and a = a_1 = 0, we obtain a_2 T^{k-1}(v) = 0; hence a_2 = 0. Continuing this process, we find that all the a's are 0; hence S is independent.
(ii) Let w ∈ W. Then
w = b v + b_1 T(v) + b_2 T^2(v) + ... + b_{k-1} T^{k-1}(v)
Using T^k(v) = 0, we have
T(w) = b T(v) + b_1 T^2(v) + ... + b_{k-2} T^{k-1}(v) ∈ W
Thus W is T-invariant.

(iii) By hypothesis T^k(v) = 0. Hence, for i = 0, ..., k-1,
T̂^k(T^i(v)) = T^{k+i}(v) = 0
That is, applying T̂^k to each generator of W, we obtain 0; hence T̂^k = 0 and so T̂ is nilpotent of index at most k. On the other hand, T̂^{k-1}(v) = T^{k-1}(v) ≠ 0; hence T̂ is nilpotent of index exactly k.

(iv) For the basis {T^{k-1}(v), T^{k-2}(v), ..., T(v), v} of W,
T̂(T^{k-1}(v)) = T^k(v) = 0
T̂(T^{k-2}(v)) = T^{k-1}(v)
..................
T̂(T(v)) = T^2(v)
T̂(v) = T(v)
Hence the matrix of T̂ in this basis is the k-square matrix displayed in (iv) above, with 1's on the superdiagonal and 0's elsewhere.
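The k-square matrix of part (iv) can be checked directly (NumPy sketch):

```python
import numpy as np

# The k-square matrix with 1's on the superdiagonal and 0's elsewhere --
# the matrix of the restriction in the basis {T^{k-1}(v), ..., T(v), v}.
k = 5
N = np.eye(k, k, k=1)

# N is nilpotent of index exactly k:
print(np.linalg.matrix_power(N, k - 1).any())  # True  (N^{k-1} != 0)
print(np.linalg.matrix_power(N, k).any())      # False (N^k = 0)
```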
10.14. Let T : V → V be linear. Let U = Ker T^i and W = Ker T^{i+1}. Show that (i) U ⊆ W, (ii) T(W) ⊆ U.

(i) Suppose u ∈ U = Ker T^i. Then T^i(u) = 0 and so T^{i+1}(u) = T(T^i(u)) = T(0) = 0. Thus u ∈ Ker T^{i+1} = W. But this is true for every u ∈ U; hence U ⊆ W.

(ii) Similarly, if w ∈ W = Ker T^{i+1}, then T^{i+1}(w) = 0. Thus T^i(T(w)) = T^{i+1}(w) = 0, so T(w) ∈ Ker T^i = U and hence T(W) ⊆ U.
10.15. Let T : V → V be linear. Let X = Ker T^{i-2}, Y = Ker T^{i-1} and Z = Ker T^i. By the preceding problem, X ⊆ Y ⊆ Z. Suppose
{u_1, ..., u_r},  {u_1, ..., u_r, v_1, ..., v_s},  {u_1, ..., u_r, v_1, ..., v_s, w_1, ..., w_t}
are bases of X, Y and Z respectively. Show that
S = {u_1, ..., u_r, T(w_1), ..., T(w_t)}
is contained in Y and is linearly independent.

By the preceding problem, T(Z) ⊆ Y and hence S ⊆ Y. Now suppose S is linearly dependent. Then there exists a relation
a_1 u_1 + ... + a_r u_r + b_1 T(w_1) + ... + b_t T(w_t) = 0
where at least one coefficient is not zero. Furthermore, since {u_i} is independent, at least one of the b_k must be nonzero. Transposing, we find
b_1 T(w_1) + ... + b_t T(w_t) = -a_1 u_1 - ... - a_r u_r ∈ X = Ker T^{i-2}
Hence   T^{i-2}(b_1 T(w_1) + ... + b_t T(w_t)) = 0
Thus   T^{i-1}(b_1 w_1 + ... + b_t w_t) = 0   and so   b_1 w_1 + ... + b_t w_t ∈ Y = Ker T^{i-1}
Since {u_i, v_j} generates Y, we obtain a relation among the u_i, v_j and w_k where one of the coefficients, i.e. one of the b_k, is not zero. This contradicts the fact that {u_i, v_j, w_k} is independent. Hence S must also be independent.
10.16. Prove Theorem 10.10: Let T : V → V be a nilpotent operator of index k. Then T has a block diagonal matrix representation whose diagonal entries are of the form

    N = ( 0 1 0 ... 0 0 )
        ( 0 0 1 ... 0 0 )
        ( .............. )
        ( 0 0 0 ... 0 1 )
        ( 0 0 0 ... 0 0 )

There is at least one N of order k and all other N are of orders ≤ k. The number of N of each possible order is uniquely determined by T. Moreover, the total number of N of all orders is the nullity of T.

Suppose dim V = n. Let W_1 = Ker T, W_2 = Ker T^2, ..., W_k = Ker T^k. Set m_i = dim W_i, for i = 1, ..., k. Since T is of index k, W_k = V and W_{k-1} ≠ V and so m_{k-1} < m_k = n. By Problem 10.14,
W_1 ⊆ W_2 ⊆ ... ⊆ W_k = V
Thus, by induction, we can choose a basis {u_1, ..., u_n} of V such that {u_1, ..., u_{m_i}} is a basis of W_i.

We now choose a new basis for V with respect to which T has the desired form. It will be convenient to label the members of this new basis by pairs of indices. We begin by setting
v(1, k) = u_{m_{k-1}+1},   v(2, k) = u_{m_{k-1}+2},   ...,   v(m_k - m_{k-1}, k) = u_{m_k}
and setting
v(1, k-1) = T v(1, k),   v(2, k-1) = T v(2, k),   ...,   v(m_k - m_{k-1}, k-1) = T v(m_k - m_{k-1}, k)
By the preceding problem,
S_1 = {u_1, ..., u_{m_{k-2}}, v(1, k-1), ..., v(m_k - m_{k-1}, k-1)}
is a linearly independent subset of W_{k-1}. We extend S_1 to a basis of W_{k-1} by adjoining new elements (if necessary) which we denote by
v(m_k - m_{k-1} + 1, k-1),   v(m_k - m_{k-1} + 2, k-1),   ...,   v(m_{k-1} - m_{k-2}, k-1)
Next we set
v(1, k-2) = T v(1, k-1),   v(2, k-2) = T v(2, k-1),   ...,   v(m_{k-1} - m_{k-2}, k-2) = T v(m_{k-1} - m_{k-2}, k-1)
Again by the preceding problem,
S_2 = {u_1, ..., u_{m_{k-3}}, v(1, k-2), ..., v(m_{k-1} - m_{k-2}, k-2)}
is a linearly independent subset of W_{k-2}, which we can extend to a basis of W_{k-2} by adjoining elements
v(m_{k-1} - m_{k-2} + 1, k-2),   v(m_{k-1} - m_{k-2} + 2, k-2),   ...,   v(m_{k-2} - m_{k-3}, k-2)
Continuing in this manner, we get a new basis for V which for convenient reference we arrange as follows:

v(1, k),   ..., v(m_k - m_{k-1}, k)
v(1, k-1), ..., v(m_k - m_{k-1}, k-1), ..., v(m_{k-1} - m_{k-2}, k-1)
.......................................................
v(1, 2), ..., v(m_k - m_{k-1}, 2), ..., v(m_{k-1} - m_{k-2}, 2), ..., v(m_2 - m_1, 2)
v(1, 1), ..., v(m_k - m_{k-1}, 1), ..., v(m_{k-1} - m_{k-2}, 1), ..., v(m_2 - m_1, 1), ..., v(m_1, 1)

The bottom row forms a basis of W_1, the bottom two rows form a basis of W_2, etc. But what is important for us is that T maps each vector into the vector immediately below it in the table, or into 0 if the vector is in the bottom row. That is,
T v(i, j) = v(i, j-1)  for j > 1,   and   T v(i, 1) = 0
Now it is clear (see Problem 10.13(iv)) that T will have the desired form if the v(i, j) are ordered lexicographically: beginning with v(1, 1) and moving up the first column to v(1, k), then jumping to v(2, 1) and moving up the second column as far as possible, etc.

Moreover, there will be exactly
m_k - m_{k-1}   diagonal entries of order k
(m_{k-1} - m_{k-2}) - (m_k - m_{k-1}) = 2m_{k-1} - m_k - m_{k-2}   diagonal entries of order k-1
..............................
2m_2 - m_1 - m_3   diagonal entries of order 2
2m_1 - m_2   diagonal entries of order 1
as can be read off directly from the table. In particular, since the numbers m_1, ..., m_k are uniquely determined by T, the number of diagonal entries of each order is uniquely determined by T. Finally, the identity
m_1 = (m_k - m_{k-1}) + (2m_{k-1} - m_k - m_{k-2}) + ... + (2m_2 - m_1 - m_3) + (2m_1 - m_2)
shows that the nullity m_1 of T is the total number of diagonal entries of T.
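The block-counting formulas at the end of the proof can be exercised with plain arithmetic (the block sizes below are hypothetical; for a nilpotent operator with those blocks, m_i = dim Ker T^i equals the sum over blocks of min(size, i)):

```python
# Round trip: from hypothetical Jordan-block sizes, compute the nullities
# m_i = dim Ker T^i, then recover the number of blocks of each order via
# the formulas 2m_j - m_{j+1} - m_{j-1} from the proof of Theorem 10.10.
sizes = [3, 2, 2, 1]                    # hypothetical block sizes, index k = 3
k = max(sizes)
m = [sum(min(s, i) for s in sizes) for i in range(k + 2)]  # m[0] = 0; m[k+1] = m[k]

counts = {}
for order in range(1, k + 1):
    counts[order] = 2 * m[order] - m[order + 1] - m[order - 1]

print(counts)   # {1: 1, 2: 2, 3: 1} -- matches the chosen sizes
print(m[1])     # 4 = total number of blocks = nullity of T
```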
10.17. Let A = ( 0 1 1 0 1 )
               ( 0 0 1 1 1 )
               ( 0 0 0 0 0 )
               ( 0 0 0 0 0 )
               ( 0 0 0 0 0 )

Then  A^2 = ( 0 0 1 1 1 )
            ( 0 0 0 0 0 )
            ( 0 0 0 0 0 )
            ( 0 0 0 0 0 )
            ( 0 0 0 0 0 )

and A^3 = 0; hence A is nilpotent of index 3. Find the nilpotent matrix M in canonical form which is similar to A.
Since A is nilpotent of index 3, M contains a diagonal block of order 3 and none greater than 3. Note that rank A = 2; hence nullity of A = 5 - 2 = 3. Thus M contains 3 diagonal blocks. Since the orders must add up to 5, M must contain one diagonal block of order 3 and two of order 1; that is,

    M = ( 0 1 0 0 0 )
        ( 0 0 1 0 0 )
        ( 0 0 0 0 0 )
        ( 0 0 0 0 0 )
        ( 0 0 0 0 0 )
I
10.18. Prove Theorem 10.11, page 226, on the Jordan canonical form for an operator T.

By the primary decomposition theorem, T is decomposable into operators T_1, ..., T_r, i.e. T = T_1 ⊕ ... ⊕ T_r, where (t - λ_i)^{m_i} is the minimal polynomial of T_i. Thus in particular,
(T_1 - λ_1 I)^{m_1} = 0,  ...,  (T_r - λ_r I)^{m_r} = 0
Set N_i = T_i - λ_i I. Then for i = 1, ..., r,
T_i = N_i + λ_i I,   where N_i^{m_i} = 0
That is, T_i is the sum of the scalar operator λ_i I and a nilpotent operator N_i, which is of index m_i since (t - λ_i)^{m_i} is the minimal polynomial of T_i.

Now by Theorem 10.10 on nilpotent operators, we can choose a basis so that N_i is in canonical form. In this basis, T_i = N_i + λ_i I is represented by a block diagonal matrix M_i whose diagonal entries are the matrices J_{ij}. The direct sum J of the matrices M_i is in Jordan canonical form and, by Theorem 10.5, is a matrix representation of T.

Lastly we must show that the blocks J_{ij} satisfy the required properties. Property (i) follows from the fact that N_i is of index m_i. Property (ii) is true since T and J have the same characteristic polynomial. Property (iii) is true since the nullity of N_i = T_i - λ_i I is equal to the geometric multiplicity of the eigenvalue λ_i. Property (iv) follows from the fact that the T_i, and hence the N_i, are uniquely determined by T.
10.19. Determine all possible Jordan canonical forms for a linear operator T : V → V whose characteristic polynomial is Δ(t) = (t - 2)^3 (t - 5)^2.

Since t - 2 has exponent 3 in Δ(t), 2 must appear three times on the main diagonal. Similarly, 5 must appear twice. Thus, listing the diagonal Jordan blocks of each form, the possible Jordan canonical forms are:

(i)   ( 2 1 0 )
      ( 0 2 1 )  ⊕  ( 5 1 )
      ( 0 0 2 )     ( 0 5 )

(ii)  ( 2 1 0 )
      ( 0 2 1 )  ⊕  ( 5 )  ⊕  ( 5 )
      ( 0 0 2 )

(iii) ( 2 1 )  ⊕  ( 2 )  ⊕  ( 5 1 )
      ( 0 2 )               ( 0 5 )

(iv)  ( 2 1 )  ⊕  ( 2 )  ⊕  ( 5 )  ⊕  ( 5 )
      ( 0 2 )

(v)   ( 2 )  ⊕  ( 2 )  ⊕  ( 2 )  ⊕  ( 5 1 )
                                     ( 0 5 )

(vi)  ( 2 )  ⊕  ( 2 )  ⊕  ( 2 )  ⊕  ( 5 )  ⊕  ( 5 )
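Counting such forms is a partition computation: each eigenvalue contributes one partition of its algebraic multiplicity into Jordan-block sizes. A small Python sketch (the helper is ours):

```python
# Each Jordan form corresponds to a choice of one partition of the
# algebraic multiplicity of each eigenvalue into block sizes.
def partitions(n, largest=None):
    """All partitions of n into non-increasing parts."""
    if largest is None:
        largest = n
    if n == 0:
        return [[]]
    result = []
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            result.append([part] + rest)
    return result

# Delta(t) = (t-2)^3 (t-5)^2: partitions of 3 for eigenvalue 2,
# partitions of 2 for eigenvalue 5.
forms = [(p2, p5) for p2 in partitions(3) for p5 in partitions(2)]
print(len(forms))  # 6 possible Jordan canonical forms
```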
10.20. Determine all possible Jordan canonical forms J for a matrix of order 5 whose minimal polynomial is m(t) = (t - 2)^2.

J must have one Jordan block of order 2 and the others must be of order 2 or 1. Thus there are only two possibilities:

J = ( 2 1 )  ⊕  ( 2 1 )  ⊕  ( 2 )      or      J = ( 2 1 )  ⊕  ( 2 )  ⊕  ( 2 )  ⊕  ( 2 )
    ( 0 2 )     ( 0 2 )                            ( 0 2 )

Note that all the diagonal entries must be 2 since 2 is the only eigenvalue.
QUOTIENT SPACE AND TRIANGULAR FORM
10.21. Let W be a subspace of a vector space V. Show that the following are equivalent:
(i) u ∈ v + W,  (ii) u - v ∈ W,  (iii) v ∈ u + W.

Suppose u ∈ v + W. Then there exists w_0 ∈ W such that u = v + w_0. Hence u - v = w_0 ∈ W. Conversely, suppose u - v ∈ W. Then u - v = w_0 where w_0 ∈ W. Hence u = v + w_0 ∈ v + W. Thus (i) and (ii) are equivalent.

We also have: u - v ∈ W iff -(u - v) = v - u ∈ W iff v ∈ u + W. Thus (ii) and (iii) are also equivalent.
10.22. Prove: The cosets of W in V partition V into mutually disjoint sets. That is:
(i) any two cosets u + W and v + W are either identical or disjoint; and
(ii) each v ∈ V belongs to a coset; in fact, v ∈ v + W.
Furthermore, u + W = v + W if and only if u - v ∈ W, and so (v + w) + W = v + W for any w ∈ W.

Let v ∈ V. Since 0 ∈ W, we have v = v + 0 ∈ v + W, which proves (ii).

Now suppose the cosets u + W and v + W are not disjoint; say, the vector x belongs to both u + W and v + W. Then u - x ∈ W and x - v ∈ W. The proof of (i) is complete if we show that u + W = v + W. Let u + w_0 be any element in the coset u + W. Since u - x, x - v and w_0 belong to W,
(u + w_0) - v = (u - x) + (x - v) + w_0 ∈ W
Thus u + w_0 ∈ v + W and hence the coset u + W is contained in the coset v + W. Similarly v + W is contained in u + W, and so u + W = v + W.

The last statement follows from the fact that u + W = v + W if and only if u ∈ v + W, and by the preceding problem this is equivalent to u - v ∈ W.
10.23. Let W be the solution space of the homogeneous equation 2x + 3y + 4z = 0. Describe the cosets of W in R^3.

W is a plane through the origin O = (0, 0, 0), and the cosets of W are the planes parallel to W. Equivalently, the cosets of W are the solution sets of the family of equations
2x + 3y + 4z = k,   k ∈ R
In particular the coset v + W, where v = (a, b, c), is the solution set of the linear equation
2x + 3y + 4z = 2a + 3b + 4c,   or   2(x - a) + 3(y - b) + 4(z - c) = 0
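A quick numerical illustration (the points are invented for the example): two vectors lie in the same coset of W precisely when their difference satisfies the homogeneous equation.

```python
# W is the plane 2x + 3y + 4z = 0; cosets are the parallel planes
# 2x + 3y + 4z = k.
def plane_value(p):
    x, y, z = p
    return 2 * x + 3 * y + 4 * z

u, v = (1, 2, 0), (0, 0, 2)          # both on the plane 2x + 3y + 4z = 8
diff = tuple(a - b for a, b in zip(u, v))
print(plane_value(u), plane_value(v), plane_value(diff))  # 8 8 0
```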
10.24. Suppose W is a subspace of a vector space V. Show that the operations in Theorem 10.15, page 229, are well defined; namely, show that if u + W = u' + W and v + W = v' + W, then
(i) (u + v) + W = (u' + v') + W   and   (ii) ku + W = ku' + W, for any k ∈ K

(i) Since u + W = u' + W and v + W = v' + W, both u - u' and v - v' belong to W. But then (u + v) - (u' + v') = (u - u') + (v - v') ∈ W. Hence (u + v) + W = (u' + v') + W.

(ii) Also, since u - u' ∈ W implies k(u - u') ∈ W, then ku - ku' = k(u - u') ∈ W; hence ku + W = ku' + W.
10.25. Let V be a vector space and W a subspace of V. Show that the natural map η : V → V/W, defined by η(v) = v + W, is linear.

For any u, v ∈ V and any k ∈ K, we have
η(u + v) = u + v + W = (u + W) + (v + W) = η(u) + η(v)
and   η(kv) = kv + W = k(v + W) = k η(v)
Accordingly, η is linear.
10.26. Let W be a subspace of a vector space V. Suppose {w_1, ..., w_r} is a basis of W and the set of cosets {v̄_1, ..., v̄_s}, where v̄_j = v_j + W, is a basis of the quotient space. Show that B = {v_1, ..., v_s, w_1, ..., w_r} is a basis of V. Thus dim V = dim W + dim (V/W).

Suppose u ∈ V. Since {v̄_j} is a basis of V/W,
ū = u + W = a_1 v̄_1 + a_2 v̄_2 + ... + a_s v̄_s
Hence u = a_1 v_1 + ... + a_s v_s + w where w ∈ W. Since {w_i} is a basis of W,
u = a_1 v_1 + ... + a_s v_s + b_1 w_1 + ... + b_r w_r
Accordingly, B generates V.

We now show that B is linearly independent. Suppose
c_1 v_1 + ... + c_s v_s + d_1 w_1 + ... + d_r w_r = 0    (1)
Then
c_1 v̄_1 + ... + c_s v̄_s = 0̄ = W
Since {v̄_j} is independent, the c's are all 0. Substituting into (1), we find d_1 w_1 + ... + d_r w_r = 0. Since {w_i} is independent, the d's are all 0. Thus B is linearly independent and therefore a basis of V.
10.27. Prove Theorem 10.16: Suppose W is a subspace invariant under a linear operator T : V → V. Then T induces a linear operator T̄ on V/W defined by T̄(v + W) = T(v) + W. Moreover, if T is a zero of any polynomial, then so is T̄. Thus the minimum polynomial of T̄ divides the minimum polynomial of T.

We first show that T̄ is well defined, i.e. if u + W = v + W then T̄(u + W) = T̄(v + W). If u + W = v + W then u - v ∈ W and, since W is T-invariant, T(u - v) = T(u) - T(v) ∈ W. Accordingly,
T̄(u + W) = T(u) + W = T(v) + W = T̄(v + W)
as required.

We next show that T̄ is linear. We have
T̄((u + W) + (v + W)) = T̄(u + v + W) = T(u + v) + W = T(u) + T(v) + W = (T(u) + W) + (T(v) + W) = T̄(u + W) + T̄(v + W)
and
T̄(k(u + W)) = T̄(ku + W) = T(ku) + W = kT(u) + W = k(T(u) + W) = kT̄(u + W)
Thus T̄ is linear.
Now, for any coset u + W in V/W,
T̄^2(u + W) = T̄(T̄(u + W)) = T̄(T(u) + W) = T^2(u) + W
Hence T̄^2 is the operator on V/W induced by T^2. Similarly, T̄^n is the operator induced by T^n for any n. Thus for any polynomial
f(t) = a_n t^n + ... + a_0 = Σ a_i t^i,
f(T̄)(u + W) = Σ a_i T̄^i(u + W) = Σ a_i (T^i(u) + W) = (Σ a_i T^i(u)) + W = f(T)(u) + W
and so f(T̄) is the operator on V/W induced by f(T). Accordingly, if T is a root of f(t), then f(T̄)(u + W) = 0 + W = W for every coset, i.e. f(T̄) = 0̄ and T̄ is also a root of f(t). Thus the theorem is proved.
10.28. Prove Theorem 10.1: Let T : V → V be a linear operator whose characteristic polynomial factors into linear polynomials. Then V has a basis in which T is represented by a triangular matrix.

The proof is by induction on the dimension of V. If dim V = 1, then every matrix representation of T is a 1 × 1 matrix, which is triangular.

Now suppose dim V = n > 1 and that the theorem holds for spaces of dimension less than n. Since the characteristic polynomial of T factors into linear polynomials, T has at least one eigenvalue and so at least one nonzero eigenvector v, say T(v) = a_{11} v. Let W be the 1-dimensional subspace spanned by v. Set V̄ = V/W. Then (Problem 10.26) dim V̄ = dim V - dim W = n - 1. Note also that W is invariant under T. By Theorem 10.16, T induces a linear operator T̄ on V̄ whose minimum polynomial divides the minimum polynomial of T. Since the characteristic polynomial of T is a product of linear polynomials, so is its minimum polynomial; hence so are the minimum and characteristic polynomials of T̄. Thus V̄ and T̄ satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis {v̄_2, ..., v̄_n} of V̄ such that
T̄(v̄_2) = a_{22} v̄_2
T̄(v̄_3) = a_{32} v̄_2 + a_{33} v̄_3
.................................
T̄(v̄_n) = a_{n2} v̄_2 + a_{n3} v̄_3 + ... + a_{nn} v̄_n

Now let v_2, ..., v_n be elements of V which belong to the cosets v̄_2, ..., v̄_n respectively. Then {v, v_2, ..., v_n} is a basis of V (Problem 10.26). Since T̄(v̄_2) = a_{22} v̄_2, we have
T̄(v̄_2) - a_{22} v̄_2 = 0̄   and so   T(v_2) - a_{22} v_2 ∈ W
But W is spanned by v; hence T(v_2) - a_{22} v_2 is a multiple of v, say
T(v_2) - a_{22} v_2 = a_{21} v,   and so   T(v_2) = a_{21} v + a_{22} v_2
Similarly, for i = 3, ..., n,
T(v_i) - a_{i2} v_2 - a_{i3} v_3 - ... - a_{ii} v_i ∈ W   and so   T(v_i) = a_{i1} v + a_{i2} v_2 + ... + a_{ii} v_i
Thus
T(v)   = a_{11} v
T(v_2) = a_{21} v + a_{22} v_2
.................................
T(v_n) = a_{n1} v + a_{n2} v_2 + ... + a_{nn} v_n
and hence the matrix of T in this basis is triangular.
CYCLIC SUBSPACES, RATIONAL CANONICAL FORM

10.29. Prove Theorem 10.12: Let Z(v, T) be a T-cyclic subspace, T_v the restriction of T to Z(v, T), and m_v(t) = t^k + a_{k-1} t^{k-1} + ... + a_0 the T-annihilator of v. Then:
(i) The set {v, T(v), ..., T^{k-1}(v)} is a basis of Z(v, T); hence dim Z(v, T) = k.
(ii) The minimal polynomial of T_v is m_v(t).
(iii) The matrix of T_v in the above basis is the companion matrix

    C = ( 0 0 ... 0 -a_0     )
        ( 1 0 ... 0 -a_1     )
        ( 0 1 ... 0 -a_2     )
        ( .................. )
        ( 0 0 ... 1 -a_{k-1} )
(i) By definition of m_v(t), T^k(v) is the first vector in the sequence v, T(v), T^2(v), ... which is a linear combination of those vectors which precede it in the sequence; hence the set B = {v, T(v), ..., T^{k-1}(v)} is linearly independent. We now only have to show that Z(v, T) = L(B), the linear span of B. By the above, T^k(v) ∈ L(B). We prove by induction that T^n(v) ∈ L(B) for every n. Suppose n > k and T^{n-1}(v) ∈ L(B), i.e. T^{n-1}(v) is a linear combination of v, ..., T^{k-1}(v). Then T^n(v) = T(T^{n-1}(v)) is a linear combination of T(v), ..., T^k(v). But T^k(v) ∈ L(B); hence T^n(v) ∈ L(B) for every n. Consequently f(T)(v) ∈ L(B) for any polynomial f(t). Thus Z(v, T) = L(B) and so B is a basis, as claimed.

(ii) Suppose m(t) = t^s + b_{s-1} t^{s-1} + ... + b_0 is the minimal polynomial of T_v. Then, since v ∈ Z(v, T),
0 = m(T_v)(v) = m(T)(v) = T^s(v) + b_{s-1} T^{s-1}(v) + ... + b_0 v
Thus T^s(v) is a linear combination of v, T(v), ..., T^{s-1}(v), and therefore k ≤ s. However, m_v(T) = 0 and so m_v(T_v) = 0. Then m(t) divides m_v(t) and so s ≤ k. Accordingly k = s and hence m_v(t) = m(t).

(iii) T_v(v) = T(v)
T_v(T(v)) = T^2(v)
....................
T_v(T^{k-2}(v)) = T^{k-1}(v)
T_v(T^{k-1}(v)) = T^k(v) = -a_0 v - a_1 T(v) - a_2 T^2(v) - ... - a_{k-1} T^{k-1}(v)
By definition, the matrix of T_v in this basis is the transpose of the matrix of coefficients of the above system of equations; hence it is C, as required.
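The companion matrix C can be built mechanically from the coefficients, and its characteristic polynomial recovers m_v(t); a SymPy sketch (the helper name is ours):

```python
import sympy as sp

t = sp.symbols('t')

def companion(coeffs):
    """Companion matrix of the monic polynomial
    t^k + coeffs[k-1] t^{k-1} + ... + coeffs[0]  (coeffs listed low to high)."""
    k = len(coeffs)
    C = sp.zeros(k, k)
    for i in range(1, k):
        C[i, i - 1] = 1           # 1's on the subdiagonal
    for i in range(k):
        C[i, k - 1] = -coeffs[i]  # last column: -a_0, ..., -a_{k-1}
    return C

# The characteristic polynomial of C is the polynomial itself:
C = companion([27, 27, 9])        # t^3 + 9t^2 + 27t + 27 = (t + 3)^3
print(C.charpoly(t).as_expr())
```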
10.30. Let T : V → V be linear. Let W be a T-invariant subspace of V and T̄ the induced operator on V/W. Prove: (i) The T-annihilator of v ∈ V divides the minimal polynomial of T. (ii) The T̄-annihilator of v̄ ∈ V/W divides the minimal polynomial of T.

(i) The T-annihilator of v ∈ V is the minimal polynomial of the restriction of T to Z(v, T) and therefore, by Problem 10.6, it divides the minimal polynomial of T.

(ii) The T̄-annihilator of v̄ ∈ V/W divides the minimal polynomial of T̄, which divides the minimal polynomial of T by Theorem 10.16.

Remark. In case the minimal polynomial of T is f(t)^n where f(t) is a monic irreducible polynomial, then the T-annihilator of v ∈ V and the T̄-annihilator of v̄ ∈ V/W are of the form f(t)^m where m ≤ n.
10.31. Prove Lemma 10.13: Let T : V → V be a linear operator whose minimal polynomial is f(t)^n where f(t) is a monic irreducible polynomial. Then V is the direct sum of T-cyclic subspaces Z_i = Z(v_i, T), i = 1, ..., r, with corresponding T-annihilators
f(t)^{n_1}, f(t)^{n_2}, ..., f(t)^{n_r},   n = n_1 ≥ n_2 ≥ ... ≥ n_r
Any other decomposition of V into the direct sum of T-cyclic subspaces has the same number of components and the same set of T-annihilators.

The proof is by induction on the dimension of V. If dim V = 1, then V is itself T-cyclic and the lemma holds. Now suppose dim V > 1 and that the lemma holds for those vector spaces of dimension less than that of V.
Since the minimal polynomial of T is f(t)^n, there exists v_1 ∈ V such that f(T)^{n-1}(v_1) ≠ 0; hence the T-annihilator of v_1 is f(t)^n. Let Z_1 = Z(v_1, T) and recall that Z_1 is T-invariant. Let V̄ = V/Z_1 and let T̄ be the linear operator on V̄ induced by T. By Theorem 10.16, the minimal polynomial of T̄ divides f(t)^n; hence the hypothesis holds for V̄ and T̄. Consequently, by induction, V̄ is the direct sum of T̄-cyclic subspaces; say,
V̄ = Z(v̄_2, T̄) ⊕ ... ⊕ Z(v̄_r, T̄)
where the corresponding T̄-annihilators are f(t)^{n_2}, ..., f(t)^{n_r}, n ≥ n_2 ≥ ... ≥ n_r.

We claim that there is a vector v_2 in the coset v̄_2 whose T-annihilator is f(t)^{n_2}, the T̄-annihilator of v̄_2. Let w be any vector in v̄_2. Then f(T)^{n_2}(w) ∈ Z_1. Hence there exists a polynomial g(t) for which
f(T)^{n_2}(w) = g(T)(v_1)    (1)
Since f(t)^n is the minimal polynomial of T, we have by (1),
0 = f(T)^n(w) = f(T)^{n-n_2} g(T)(v_1)
But f(t)^n is the T-annihilator of v_1; hence f(t)^n divides f(t)^{n-n_2} g(t) and so g(t) = f(t)^{n_2} h(t) for some polynomial h(t). We set
v_2 = w - h(T)(v_1)
Since w - v_2 = h(T)(v_1) ∈ Z_1, v_2 also belongs to the coset v̄_2. Thus the T-annihilator of v_2 is a multiple of the T̄-annihilator of v̄_2. On the other hand, by (1),
f(T)^{n_2}(v_2) = f(T)^{n_2}(w - h(T)(v_1)) = f(T)^{n_2}(w) - g(T)(v_1) = 0
Consequently the T-annihilator of v_2 is f(t)^{n_2}, as claimed.

Similarly, there exist vectors v_3, ..., v_r ∈ V such that v_i ∈ v̄_i and the T-annihilator of v_i is f(t)^{n_i}, the T̄-annihilator of v̄_i. We set
Z_2 = Z(v_2, T), ..., Z_r = Z(v_r, T)
Let d denote the degree of f(t), so that f(t)^{n_i} has degree d n_i. Then, since f(t)^{n_i} is both the T-annihilator of v_i and the T̄-annihilator of v̄_i, we know that
{v_i, T(v_i), ..., T^{d n_i - 1}(v_i)}   and   {v̄_i, T̄(v̄_i), ..., T̄^{d n_i - 1}(v̄_i)}
are bases for Z(v_i, T) and Z(v̄_i, T̄) respectively, for i = 2, ..., r. But V̄ = Z(v̄_2, T̄) ⊕ ... ⊕ Z(v̄_r, T̄); hence
{v̄_2, ..., T̄^{d n_2 - 1}(v̄_2), ..., v̄_r, ..., T̄^{d n_r - 1}(v̄_r)}
is a basis for V̄. Therefore, by Problem 10.26 and the relation T̄^i(v̄) = T^i(v) + Z_1 (see Problem 10.27),
{v_1, ..., T^{d n_1 - 1}(v_1), v_2, ..., T^{d n_2 - 1}(v_2), ..., v_r, ..., T^{d n_r - 1}(v_r)}
is a basis for V. Thus by Theorem 10.4, V = Z(v_1, T) ⊕ ... ⊕ Z(v_r, T), as required.

It remains to show that the exponents n_1, ..., n_r are uniquely determined by T. Since d denotes the degree of f(t),
dim V = d(n_1 + ... + n_r)   and   dim Z_i = d n_i,  i = 1, ..., r
Also, if s is any positive integer then (Problem 10.59) f(T)^s(Z_i) is a cyclic subspace generated by f(T)^s(v_i), and it has dimension d(n_i - s) if n_i > s and dimension 0 if n_i ≤ s.

Now any vector v ∈ V can be written uniquely in the form v = w_1 + ... + w_r where w_i ∈ Z_i. Hence any vector in f(T)^s(V) can be written uniquely in the form
f(T)^s(v) = f(T)^s(w_1) + ... + f(T)^s(w_r)
where f(T)^s(w_i) ∈ f(T)^s(Z_i). Let t be the integer, dependent on s, for which
n_1 > s, ..., n_t > s,   n_{t+1} ≤ s
Then   f(T)^s(V) = f(T)^s(Z_1) ⊕ ... ⊕ f(T)^s(Z_t)
and so   dim (f(T)^s(V)) = d[(n_1 - s) + ... + (n_t - s)]    (*)
The numbers on the left of (*) are uniquely determined by T. Set s = n - 1 and (*) determines the number of n_i equal to n. Next set s = n - 2 and (*) determines the number of n_i (if any) equal to n - 1. We repeat the process until we set s = 0 and determine the number of n_i equal to 1. Thus the n_i are uniquely determined by T and V, and the lemma is proved.
10.32. Let V be a vector space of dimension 7 over R, and let T : V → V be a linear operator with minimal polynomial m(t) = (t^2 + 2)(t + 3)^3. Find all the possible rational canonical forms for T.

The sum of the degrees of the companion matrices must add up to 7. Also, one companion matrix must be C(t^2 + 2) and one must be C((t + 3)^3). Thus the rational canonical form of T is exactly one of the following direct sums of companion matrices:
(i)   C(t^2 + 2) ⊕ C(t^2 + 2) ⊕ C((t + 3)^3)
(ii)  C(t^2 + 2) ⊕ C((t + 3)^3) ⊕ C((t + 3)^2)
(iii) C(t^2 + 2) ⊕ C((t + 3)^3) ⊕ C(t + 3) ⊕ C(t + 3)

Explicitly, with (t + 3)^3 = t^3 + 9t^2 + 27t + 27 and (t + 3)^2 = t^2 + 6t + 9:

(i)   ( 0 -2 )  ⊕  ( 0 -2 )  ⊕  ( 0 0 -27 )
      ( 1  0 )     ( 1  0 )     ( 1 0 -27 )
                                ( 0 1  -9 )

(ii)  ( 0 -2 )  ⊕  ( 0 0 -27 )  ⊕  ( 0 -9 )
      ( 1  0 )     ( 1 0 -27 )     ( 1 -6 )
                   ( 0 1  -9 )

(iii) ( 0 -2 )  ⊕  ( 0 0 -27 )  ⊕  ( -3 )  ⊕  ( -3 )
      ( 1  0 )     ( 1 0 -27 )
                   ( 0 1  -9 )
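Form (i) can be assembled and checked in exact arithmetic (SymPy sketch; the companion-matrix helper repeats the construction of Problem 10.29):

```python
import sympy as sp

t = sp.symbols('t')

def companion(poly):
    """Companion matrix C(p) of a monic polynomial p in t."""
    coeffs = sp.Poly(poly, t).all_coeffs()[1:]   # drop leading 1; high to low
    k = len(coeffs)
    C = sp.zeros(k, k)
    for i in range(1, k):
        C[i, i - 1] = 1                          # 1's on the subdiagonal
    for i in range(k):
        C[i, k - 1] = -coeffs[k - 1 - i]         # last column: -a_0, ..., -a_{k-1}
    return C

# Form (i): C(t^2+2) ⊕ C(t^2+2) ⊕ C((t+3)^3), a 7x7 matrix whose
# characteristic polynomial is (t^2 + 2)^2 (t + 3)^3.
M = sp.diag(companion(t**2 + 2), companion(t**2 + 2), companion((t + 3)**3))
print(M.shape)  # (7, 7)
print(sp.factor(M.charpoly(t).as_expr()))
```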
PROJECTIONS

10.33. Suppose V = W_1 ⊕ ... ⊕ W_r. The projection of V into its subspace W_k is the mapping E : V → V defined by E(v) = w_k, where v = w_1 + ... + w_r, w_i ∈ W_i. Show that (i) E is linear, (ii) E^2 = E.

(i) Since the sum v = w_1 + ... + w_r, w_i ∈ W_i, is uniquely determined by v, the mapping E is well defined. Suppose, for u ∈ V, u = w'_1 + ... + w'_r, w'_i ∈ W_i. Then
v + u = (w_1 + w'_1) + ... + (w_r + w'_r)   and   kv = kw_1 + ... + kw_r,   kw_i, w_i + w'_i ∈ W_i
are the unique sums corresponding to v + u and kv. Hence
E(v + u) = w_k + w'_k = E(v) + E(u)   and   E(kv) = kw_k = kE(v)
and therefore E is linear.

(ii) We have that
w_k = 0 + ... + 0 + w_k + 0 + ... + 0
is the unique sum corresponding to w_k ∈ W_k; hence E(w_k) = w_k. Then for any v ∈ V,
E^2(v) = E(E(v)) = E(w_k) = w_k = E(v)
Thus E^2 = E, as required.
10.34. Suppose E : V → V is linear and E^2 = E. Show that: (i) E(u) = u for any u ∈ Im E, i.e. the restriction of E to its image is the identity mapping; (ii) V is the direct sum of the image and kernel of E: V = Im E ⊕ Ker E; (iii) E is the projection of V into Im E, its image. Thus, by the preceding problem, a linear mapping T : V → V is a projection if and only if T^2 = T; this characterization of a projection is frequently used as its definition.

(i) If u ∈ Im E, then there exists v ∈ V for which E(v) = u; hence
E(u) = E(E(v)) = E^2(v) = E(v) = u
as required.

(ii) Let v ∈ V. We can write v in the form v = E(v) + (v - E(v)). Now E(v) ∈ Im E and, since
E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0,
v - E(v) ∈ Ker E. Accordingly, V = Im E + Ker E.

Now suppose w ∈ Im E ∩ Ker E. By (i), E(w) = w because w ∈ Im E. On the other hand, E(w) = 0 because w ∈ Ker E. Thus w = 0 and so Im E ∩ Ker E = {0}. These two conditions imply that V is the direct sum of the image and kernel of E.

(iii) Let v ∈ V and suppose v = u + w where u ∈ Im E and w ∈ Ker E. Note that E(u) = u by (i), and E(w) = 0 because w ∈ Ker E. Hence
E(v) = E(u + w) = E(u) + E(w) = u + 0 = u
That is, E is the projection of V into its image.
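A NumPy illustration of (ii) with an invented idempotent matrix: E projects R^2 onto Im E = span{(1, 1)} along Ker E = span{(0, 1)}, and every v splits as E(v) + (v - E(v)).

```python
import numpy as np

# An idempotent matrix (E^2 = E) is a projection.
E = np.array([[1.0, 0.0],
              [1.0, 0.0]])
print(np.allclose(E @ E, E))   # True: E is idempotent

v = np.array([3.0, 7.0])
u = E @ v          # component in Im E
w = v - u          # component in Ker E
print(u, w)        # [3. 3.] [0. 4.]
print(E @ w)       # [0. 0.]  -- w really is in the kernel
```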
10.35. Suppose V = U ⊕ W and suppose T : V → V is linear. Show that U and W are both T-invariant if and only if TE = ET, where E is the projection of V into U.

Observe that E(v) ∈ U for every v ∈ V, and that (i) E(v) = v iff v ∈ U, (ii) E(v) = 0 iff v ∈ W.

Suppose ET = TE. Let u ∈ U. Since E(u) = u,
T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) ∈ U
Hence U is T-invariant. Now let w ∈ W. Since E(w) = 0,
E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0   and so   T(w) ∈ W
Hence W is also T-invariant.

Conversely, suppose U and W are both T-invariant. Let v ∈ V and suppose v = u + w where u ∈ U and w ∈ W. Then T(u) ∈ U and T(w) ∈ W; hence E(T(u)) = T(u) and E(T(w)) = 0. Thus
(ET)(v) = (ET)(u + w) = (ET)(u) + (ET)(w) = E(T(u)) + E(T(w)) = T(u)
and   (TE)(v) = (TE)(u + w) = T(E(u + w)) = T(u)
That is, (ET)(v) = (TE)(v) for every v ∈ V; therefore ET = TE, as required.
Supplementary Problems
INVARIANT SUBSPACES
10.36. Suppose W is invariant under T:V ^V. Show that W is invariant under f{T) for any polynomial
10.37. Show that every subspace of V is invariant under I and 0, the identity and zero operators.
10.38. Suppose W is invariant under S: V → V and T: V → V. Show that W is also invariant under
S + T and ST.
10.39. Let T: V → V be linear and let W be the eigenspace belonging to an eigenvalue λ of T. Show that
W is T-invariant.
10.40. Let V be a vector space of odd dimension (greater than 1) over the real field R. Show that any
linear operator on V has an invariant subspace other than V or {0}.
10.41. Determine the invariant subspaces of A = ( 2 −4 ; 5 −2 ) viewed as a linear operator on (i) R², (ii) C².
10.42. Suppose dim V = n. Show that T: V → V has a triangular matrix representation if and only if
there exist T-invariant subspaces W₁ ⊂ W₂ ⊂ ⋯ ⊂ Wₙ = V for which dim Wₖ = k, k = 1, …, n.
INVARIANT DIRECT-SUMS
10.43. The subspaces W₁, …, W_r are said to be independent if w₁ + ⋯ + w_r = 0, wᵢ ∈ Wᵢ, implies that
each wᵢ = 0. Show that L(Wᵢ) = W₁ ⊕ ⋯ ⊕ W_r if and only if the Wᵢ are independent. (Here
L(Wᵢ) denotes the linear span of the Wᵢ.)
10.44. Show that V = W₁ ⊕ ⋯ ⊕ W_r if and only if (i) V = L(Wᵢ) and (ii) Wₖ ∩ L(W₁, …, Wₖ₋₁,
Wₖ₊₁, …, W_r) = {0}, k = 1, …, r.
10.45. Show that L(Wᵢ) = W₁ ⊕ ⋯ ⊕ W_r if and only if dim L(Wᵢ) = dim W₁ + ⋯ + dim W_r.
10.46. Suppose the characteristic polynomial of T: V → V is Δ(t) = f₁(t)^n₁ f₂(t)^n₂ ⋯ f_r(t)^n_r where the
fᵢ(t) are distinct monic irreducible polynomials. Let V = W₁ ⊕ ⋯ ⊕ W_r be the primary decom-
position of V into T-invariant subspaces. Show that fᵢ(t)^nᵢ is the characteristic polynomial of the
restriction of T to Wᵢ.
NILPOTENT OPERATORS
10.47. Suppose S and T are nilpotent operators which commute, i.e. ST = TS. Show that S+T and ST
are also nilpotent.
10.48. Suppose A is a supertriangular matrix, i.e. all entries on and below the main diagonal are 0. Show
that A is nilpotent.
10.49. Let V be the vector space of polynomials of degree ≤ n. Show that the differential operator on V
is nilpotent of index n + 1.
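Problem 10.49 can be checked concretely: in the coefficient basis {1, t, …, tⁿ}, differentiation is the matrix with entries D[k−1, k] = k, and its powers vanish for the first time at the (n+1)st power. A sketch with numpy:

```python
import numpy as np

n = 4  # polynomials of degree <= n, coefficient basis {1, t, ..., t^n}

# Differentiation maps t^k -> k * t^(k-1), so D has k at position (k-1, k).
D = np.zeros((n + 1, n + 1))
for k in range(1, n + 1):
    D[k - 1, k] = k

Dn = np.linalg.matrix_power(D, n)        # D^n is not yet zero (it sends t^n to n!)
Dn1 = np.linalg.matrix_power(D, n + 1)   # D^(n+1) is the zero matrix

not_yet_zero = bool(Dn.any())
vanishes = not Dn1.any()
```

Since D^n ≠ 0 while D^(n+1) = 0, the index of nilpotency is exactly n + 1.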
10.50. Show that the following two nilpotent matrices of order n are similar: the matrix A with 1's on the
superdiagonal (just above the main diagonal) and 0's elsewhere, and its transpose Aᵗ, with 1's on
the subdiagonal and 0's elsewhere.
10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index
of nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4.
JORDAN CANONICAL FORM
10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial Δ(t)
and minimal polynomial m(t) are as follows:
(i) Δ(t) = (t − 2)⁴(t − 3)², m(t) = (t − 2)²(t − 3)²
(ii) Δ(t) = (t − 7)⁵, m(t) = (t − 7)²
(iii) Δ(t) = (t − 2)⁷, m(t) = (t − 2)³
(iv) Δ(t) = (t − 3)⁴(t − 5)⁴, m(t) = (t − 3)²(t − 5)²
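The possibilities in Problem 10.52 can be enumerated mechanically: for each eigenvalue, the Jordan block sizes are exactly the partitions of its exponent in Δ(t) whose largest part equals its exponent in m(t). A short sketch (the helper names here are ours, not the text's):

```python
def partitions(n, max_part):
    """All partitions of n with parts <= max_part, as descending tuples."""
    if n == 0:
        return [()]
    result = []
    for p in range(min(n, max_part), 0, -1):
        for rest in partitions(n - p, p):
            result.append((p,) + rest)
    return result

def jordan_block_sizes(alg_mult, min_exp):
    """Block-size lists for one eigenvalue: partitions of the algebraic
    multiplicity whose largest part equals the exponent in m(t)."""
    return [p for p in partitions(alg_mult, min_exp) if p and p[0] == min_exp]

# Part (i): eigenvalue 2 has exponent 4 in Delta(t) and 2 in m(t).
forms_for_2 = jordan_block_sizes(4, 2)   # block sizes (2,2) or (2,1,1)
# Part (i): eigenvalue 3 has exponent 2 in both, forcing a single 2-block.
forms_for_3 = jordan_block_sizes(2, 2)
# Part (iii): eigenvalue 2 with exponents 7 and 3 admits four block patterns.
count_iii = len(jordan_block_sizes(7, 3))
```

Taking one choice per eigenvalue and forming the block-diagonal sum gives each possible Jordan canonical form.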
10.53. Show that every complex matrix is similar to its transpose. (Hint. Use the Jordan canonical form and
Problem 10.50.)
10.54. Show that all complex matrices A of order n for which Aⁿ = I are similar.
10.55. Suppose A is a complex matrix with only real eigenvalues. Show that A is similar to a matrix with
only real entries.
CYCLIC SUBSPACES
10.56. Suppose T: V → V is linear. Prove that Z(v, T) is the intersection of all T-invariant subspaces
containing v.
10.57. Let f(t) and g(t) be the T-annihilators of u and v respectively. Show that if f(t) and g(t) are rel-
atively prime, then f(t)g(t) is the T-annihilator of u + v.
10.58. Prove that Z(u, T) = Z(v, T) if and only if g(T)(u) = v where g(t) is relatively prime to the
T-annihilator of u.
10.59. Let W = Z{v, T), and suppose the Tannihilator of v is /(*)" where f(t) is a monic irreducible poly
nomial of degree d. Show that f{T)^{W) is a cyclic subspace generated by f(Ty(v) and it has dimen
sion d{n — s) if n > s and dimension if n — s.
RATIONAL CANONICAL FORM
10.60. Find all possible rational canonical forms for:
(i) 6 × 6 matrices with minimum polynomial m(t) = (t² + 3)(t + 1)²
(ii) 6 × 6 matrices with minimum polynomial m(t) = (t + 1)³
(iii) 8 × 8 matrices with minimum polynomial m(t) = (t² + 2)²(t + 3)²
10.61. Let A be a 4 × 4 matrix with minimum polynomial m(t) = (t² + 1)(t² − 3). Find the rational ca-
nonical form for A if A is a matrix over (i) the rational field Q, (ii) the real field R, (iii) the com-
plex field C.
10.62. Find the rational canonical form for the 4-square Jordan block with eigenvalue λ.
10.63. Prove that the characteristic polynomial of an operator T: V → V is a product of its elementary
divisors.
10.64. Prove that two 3 × 3 matrices with the same minimum and characteristic polynomials are similar.
10.65. Let C(f(t)) denote the companion matrix of an arbitrary polynomial f(t). Show that f(t) is the char-
acteristic polynomial of C(f(t)).
PROJECTIONS
10.66. Suppose V = W₁ ⊕ ⋯ ⊕ W_r. Let Eᵢ denote the projection of V into Wᵢ. Prove: (i) EᵢEⱼ = 0,
i ≠ j; (ii) I = E₁ + ⋯ + E_r.
10.67. Let E₁, …, E_r be linear operators on V such that: (i) Eᵢ² = Eᵢ, i.e. the Eᵢ are projections;
(ii) EᵢEⱼ = 0, i ≠ j; (iii) I = E₁ + ⋯ + E_r. Prove that V = Im E₁ ⊕ ⋯ ⊕ Im E_r.
10.68. Suppose E: V → V is a projection, i.e. E² = E. Prove that E has a matrix representation of the
form ( I_r 0 ; 0 0 ), where r is the rank of E and I_r is the r-square identity matrix.
10.69. Prove that any two projections of the same rank are similar. (Hint. Use the result of Problem
10.68.)
10.70. Suppose E: V → V is a projection. Prove:
(i) I − E is a projection and V = Im E ⊕ Im (I − E); (ii) I + E is invertible (if 1 + 1 ≠ 0).
QUOTIENT SPACES
10.71. Let W be a subspace of V. Suppose the set of cosets {v₁ + W, v₂ + W, …, vₙ + W} in V/W is
linearly independent. Show that the set of vectors {v₁, v₂, …, vₙ} in V is also linearly independent.
10.72. Let W be a subspace of V. Suppose the set of vectors {u₁, u₂, …, uₙ} in V is linearly independent,
and that L(uᵢ) ∩ W = {0}. Show that the set of cosets {u₁ + W, …, uₙ + W} in V/W is also
linearly independent.
10.73. Suppose V = U ⊕ W and that {u₁, …, uₙ} is a basis of U. Show that {u₁ + W, …, uₙ + W} is
a basis of the quotient space V/W. (Observe that no condition is placed on the dimensionality of
V or W.)
10.74. Let W be the solution space of the linear equation
a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = 0, aᵢ ∈ K
and let v = (b₁, b₂, …, bₙ) ∈ Kⁿ. Prove that the coset v + W of W in Kⁿ is the solution set of the
linear equation
a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = b where b = a₁b₁ + ⋯ + aₙbₙ
10.75. Let V be the vector space of polynomials over R and let W be the subspace of polynomials divisible
by t⁴, i.e. of the form a₀t⁴ + a₁t⁵ + ⋯ + aₙ₋₄tⁿ. Show that the quotient space V/W is of dimension 4.
10.76. Let U and W be subspaces of V such that W ⊂ U ⊂ V. Note that any coset u + W of W in U may
also be viewed as a coset of W in V since u ∈ U implies u ∈ V; hence U/W is a subset of V/W.
Prove that (i) U/W is a subspace of V/W, (ii) dim (V/W) − dim (U/W) = dim (V/U).
10.77. Let U and W be subspaces of V. Show that the cosets of U ∩ W in V can be obtained by inter-
secting each of the cosets of U in V with each of the cosets of W in V:
V/(U ∩ W) = {(v + U) ∩ (v′ + W): v, v′ ∈ V}
10.78. Let T: V → V be linear with kernel W and image U. Show that the quotient space V/W is
isomorphic to U under the mapping θ: V/W → U defined by θ(v + W) = T(v). Furthermore, show
that T = i ∘ θ ∘ η, where η: V → V/W is the natural mapping of V into V/W, i.e. η(v) = v + W,
and i: U ⊂ V is the inclusion mapping, i.e. i(u) = u. (See diagram.)
Answers to Supplementary Problems
10.41. (i) R² and {0}; (ii) C², {0}, W₁ = L((2, 1 − 2i)), W₂ = L((2, 1 + 2i))
10.52. Below, J_k(λ) denotes the k-square Jordan block with eigenvalue λ (λ's on the diagonal, 1's on the
superdiagonal), and diag(…) a block-diagonal matrix. The possible Jordan canonical forms are:
(i) diag(J₂(2), J₂(2), J₂(3)) and diag(J₂(2), J₁(2), J₁(2), J₂(3))
(ii) diag(J₂(7), J₂(7), J₁(7)) and diag(J₂(7), J₁(7), J₁(7), J₁(7))
(iii) diag(J₃(2), J₃(2), J₁(2)), diag(J₃(2), J₂(2), J₂(2)), diag(J₃(2), J₂(2), J₁(2), J₁(2)) and
diag(J₃(2), J₁(2), J₁(2), J₁(2), J₁(2))
(iv) diag(J₂(3), J₂(3), J₂(5), J₂(5)), diag(J₂(3), J₂(3), J₂(5), J₁(5), J₁(5)),
diag(J₂(3), J₁(3), J₁(3), J₂(5), J₂(5)) and diag(J₂(3), J₁(3), J₁(3), J₂(5), J₁(5), J₁(5))
10.60. Below, C(f) denotes the companion matrix of f(t) (Problem 10.65), with rows separated by
semicolons. The possible rational canonical forms are:
(i) diag(C(t² + 3), C((t + 1)²), C(t² + 3)), diag(C(t² + 3), C((t + 1)²), C((t + 1)²)) and
diag(C(t² + 3), C((t + 1)²), C(t + 1), C(t + 1)), where
C(t² + 3) = ( 0 −3 ; 1 0 ), C((t + 1)²) = ( 0 −1 ; 1 −2 ), C(t + 1) = ( −1 )
(ii) diag(C((t + 1)³), C((t + 1)³)), diag(C((t + 1)³), C((t + 1)²), C(t + 1)) and
diag(C((t + 1)³), C(t + 1), C(t + 1), C(t + 1)), where
C((t + 1)³) = ( 0 0 −1 ; 1 0 −3 ; 0 1 −3 )
(iii) diag(C((t² + 2)²), C((t + 3)²), C(t² + 2)), diag(C((t² + 2)²), C((t + 3)²), C((t + 3)²)) and
diag(C((t² + 2)²), C((t + 3)²), C(t + 3), C(t + 3)), where
C((t² + 2)²) = ( 0 0 0 −4 ; 1 0 0 0 ; 0 1 0 −4 ; 0 0 1 0 ), C((t + 3)²) = ( 0 −9 ; 1 −6 ),
C(t² + 2) = ( 0 −2 ; 1 0 ), C(t + 3) = ( −3 )
10.61. (i) Over Q: diag(C(t² + 1), C(t² − 3)), with blocks ( 0 −1 ; 1 0 ) and ( 0 3 ; 1 0 )
(ii) Over R: diag(( 0 −1 ; 1 0 ), ( √3 ), ( −√3 ))
(iii) Over C: diag(( i ), ( −i ), ( √3 ), ( −√3 ))
10.62. The companion matrix of (t − λ)⁴, namely
( 0 0 0 −λ⁴ ; 1 0 0 4λ³ ; 0 1 0 −6λ² ; 0 0 1 4λ )
Chapter 11
Linear Functionals and the Dual Space
INTRODUCTION
In this chapter we study linear mappings from a vector space V into its field K of scalars.
(Unless otherwise stated or implied, we view K as a vector space over itself.) Naturally
all the theorems and results for arbitrary linear mappings on V hold for this special case.
However, we treat these mappings separately because of their fundamental importance and
because the special relationship of V to K gives rise to new notions and results which do not
apply in the general case.
LINEAR FUNCTIONALS AND THE DUAL SPACE
Let V be a vector space over a field K. A mapping φ: V → K is termed a linear func-
tional (or linear form) if, for every u, v ∈ V and every a, b ∈ K,
φ(au + bv) = a φ(u) + b φ(v)
In other words, a linear functional on V is a linear mapping from V into K.
Example 11.1: Let πᵢ: Kⁿ → K be the ith projection mapping, i.e. πᵢ(a₁, a₂, …, aₙ) = aᵢ. Then πᵢ
is linear and so it is a linear functional on Kⁿ.
Example 11.2: Let V be the vector space of polynomials in t over R. Let J: V → R be the integral
operator defined by J(p(t)) = ∫₀¹ p(t) dt. Recall that J is linear; and hence it is
a linear functional on V.
Example 11.3: Let V be the vector space of n-square matrices over K. Let T: V → K be the trace
mapping
T(A) = a₁₁ + a₂₂ + ⋯ + aₙₙ, where A = (aᵢⱼ)
That is, T assigns to a matrix A the sum of its diagonal elements. This map is
linear (Problem 11.27) and so it is a linear functional on V.
By Theorem 6.6, the set of linear functionals on a vector space V over a field K is also
a vector space over K with addition and scalar multiplication defined by
(φ + σ)(v) = φ(v) + σ(v) and (kφ)(v) = k φ(v)
where φ and σ are linear functionals on V and k ∈ K. This space is called the dual space of
V and is denoted by V*.
Example 11.4: Let V = Kⁿ, the vector space of n-tuples which we write as column vectors. Then
the dual space V* can be identified with the space of row vectors. In particular,
any linear functional φ = (a₁, …, aₙ) in V* has the representation
φ(x₁, …, xₙ) = (a₁, a₂, …, aₙ)(x₁, x₂, …, xₙ)ᵗ
or simply
φ(x₁, …, xₙ) = a₁x₁ + a₂x₂ + ⋯ + aₙxₙ
Historically, the above formal expression was termed a linear form.
DUAL BASIS
Suppose V is a vector space of dimension n over K. By Theorem 6.7, the dimension of
the dual space V* is also n (since K is of dimension 1 over itself). In fact, each basis of V
determines a basis of V* as follows:
Theorem 11.1: Suppose {v₁, …, vₙ} is a basis of V over K. Let φ₁, …, φₙ ∈ V* be the
linear functionals defined by
φᵢ(vⱼ) = δᵢⱼ = 1 if i = j, and 0 if i ≠ j
Then {φ₁, …, φₙ} is a basis of V*.
The above basis {φᵢ} is termed the basis dual to {vᵢ} or the dual basis. The above for-
mula which uses the Kronecker delta δᵢⱼ is a short way of writing
φ₁(v₁) = 1, φ₁(v₂) = 0, φ₁(v₃) = 0, …, φ₁(vₙ) = 0
φ₂(v₁) = 0, φ₂(v₂) = 1, φ₂(v₃) = 0, …, φ₂(vₙ) = 0
and so on, through φₙ(v₁) = 0, …, φₙ(vₙ) = 1.
By Theorem 6.2, these linear mappings φᵢ are unique and well defined.
Example 11.5: Consider the following basis of R²: {v₁ = (2, 1), v₂ = (3, 1)}. Find the dual basis
{φ₁, φ₂}.
We seek linear functionals φ₁(x, y) = ax + by and φ₂(x, y) = cx + dy such that
φ₁(v₁) = 1, φ₁(v₂) = 0, φ₂(v₁) = 0, φ₂(v₂) = 1
Thus φ₁(v₁) = φ₁(2, 1) = 2a + b = 1 and φ₁(v₂) = φ₁(3, 1) = 3a + b = 0, so a = −1, b = 3
φ₂(v₁) = φ₂(2, 1) = 2c + d = 0 and φ₂(v₂) = φ₂(3, 1) = 3c + d = 1, so c = 1, d = −2
Hence the dual basis is {φ₁(x, y) = −x + 3y, φ₂(x, y) = x − 2y}.
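The computation in this example amounts to inverting the matrix whose columns are the given basis vectors: since φᵢ(vⱼ) = δᵢⱼ, the rows of P⁻¹ are the coefficient vectors of the dual functionals. A sketch with numpy:

```python
import numpy as np

# Columns of P are the basis vectors v1 = (2, 1) and v2 = (3, 1).
P = np.array([[2.0, 3.0],
              [1.0, 1.0]])

# Row i of P^-1 dotted with column j of P is delta_ij,
# so the rows of P^-1 are the dual-basis coefficient vectors.
D = np.linalg.inv(P)
phi1, phi2 = D[0], D[1]

# Agrees with the dual basis computed above: -x + 3y and x - 2y.
matches = np.allclose(D, [[-1.0, 3.0],
                          [1.0, -2.0]])
```

The same one-line recipe (rows of the inverse of the basis-column matrix) works for any basis of Kⁿ.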
The next theorems give relationships between bases and their duals.
Theorem 11.2: Let {v₁, …, vₙ} be a basis of V and let {φ₁, …, φₙ} be the dual basis of V*.
Then for any vector u ∈ V,
u = φ₁(u)v₁ + φ₂(u)v₂ + ⋯ + φₙ(u)vₙ
and, for any linear functional σ ∈ V*,
σ = σ(v₁)φ₁ + σ(v₂)φ₂ + ⋯ + σ(vₙ)φₙ
Theorem 11.3: Let {v₁, …, vₙ} and {w₁, …, wₙ} be bases of V and let {φ₁, …, φₙ} and
{σ₁, …, σₙ} be the bases of V* dual to {vᵢ} and {wᵢ} respectively. Suppose
P is the transition matrix from {vᵢ} to {wᵢ}. Then (P⁻¹)ᵗ is the transition
matrix from {φᵢ} to {σᵢ}.
SECOND DUAL SPACE
We repeat: every vector space V has a dual space V* which consists of all the linear
functionals on V. Thus V* itself has a dual space V**, called the second dual of V, which
consists of all the linear functionals on V*.
We now show that each v ∈ V determines a specific element v̂ ∈ V**. First of all,
for any φ ∈ V* we define
v̂(φ) = φ(v)
It remains to be shown that this map v̂: V* → K is linear. For any scalars a, b ∈ K and
any linear functionals φ, σ ∈ V*, we have
v̂(aφ + bσ) = (aφ + bσ)(v) = a φ(v) + b σ(v) = a v̂(φ) + b v̂(σ)
That is, v̂ is linear and so v̂ ∈ V**. The following theorem applies.
Theorem 11.4: If V has finite dimension, then the mapping v ↦ v̂ is an isomorphism of V
onto V**.
The above mapping v ↦ v̂ is called the natural mapping of V into V**. We emphasize
that this mapping is never onto V** if V is not finite-dimensional. However, it is always
linear and, moreover, it is always one-to-one.
Now suppose V does have finite dimension. By the above theorem the natural mapping
determines an isomorphism between V and V**. Unless otherwise stated we shall identify
V with V** by this mapping. Accordingly we shall view V as the space of linear functionals
on V* and shall write V = V**. We remark that if {φᵢ} is the basis of V* dual to a basis
{vᵢ} of V, then {vᵢ} is the basis of V = V** which is dual to {φᵢ}.
ANNIHILATORS
Let W be a subset (not necessarily a subspace) of a vector space V. A linear functional
φ ∈ V* is called an annihilator of W if φ(w) = 0 for every w ∈ W, i.e. if φ(W) = {0}.
We show that the set of all such mappings, denoted by W⁰ and called the annihilator of W,
is a subspace of V*. Clearly 0 ∈ W⁰. Now suppose φ, σ ∈ W⁰. Then, for any scalars
a, b ∈ K and for any w ∈ W,
(aφ + bσ)(w) = a φ(w) + b σ(w) = a0 + b0 = 0
Thus aφ + bσ ∈ W⁰ and so W⁰ is a subspace of V*.
In the case that W is a subspace of V, we have the following relationship between W and
its annihilator W⁰.
Theorem 11.5: Suppose V has finite dimension and W is a subspace of V. Then
(i) dim W + dim W⁰ = dim V and (ii) W⁰⁰ = W.
Here W⁰⁰ = {v ∈ V: φ(v) = 0 for every φ ∈ W⁰} or, equivalently, W⁰⁰ = (W⁰)⁰ where
W⁰⁰ is viewed as a subspace of V under the identification of V and V**.
The concept of an annihilator enables us to give another interpretation of a homogeneous
system of linear equations,
a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ = 0
...................................  (*)
aₘ₁x₁ + aₘ₂x₂ + ⋯ + aₘₙxₙ = 0
Here each row (aᵢ₁, aᵢ₂, …, aᵢₙ) of the coefficient matrix A = (aᵢⱼ) is viewed as an element
of Kⁿ and each solution vector ξ = (x₁, x₂, …, xₙ) is viewed as an element of the dual space.
In this context, the solution space S of (*) is the annihilator of the rows of A and hence of
the row space of A. Consequently, using Theorem 11.5, we again obtain the following
fundamental result on the dimension of the solution space of a homogeneous system of
linear equations:
dim S = dim Kⁿ − dim (row space of A) = n − rank (A)
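The dimension formula can be checked numerically; the matrix A below is our own example, built with one dependent row so that the rank is visibly smaller than the number of rows:

```python
import numpy as np

# Coefficient matrix of a homogeneous system in n = 4 unknowns.
# The third row is the sum of the first two, so the row space is 2-dimensional.
A = np.array([[1.0, 2.0, -3.0, 4.0],
              [0.0, 1.0,  4.0, -1.0],
              [1.0, 3.0,  1.0,  3.0]])

n = A.shape[1]
rank = int(np.linalg.matrix_rank(A))
dim_S = n - rank   # dimension of the solution space S
```

Here rank (A) = 2, so the solution space of Ax = 0 has dimension 4 − 2 = 2, in agreement with the formula above.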
TRANSPOSE OF A LINEAR MAPPING
Let T: V → U be an arbitrary linear mapping from a vector space V into a vector space
U. Now for any linear functional φ ∈ U*, the composition φ ∘ T is a linear mapping from
V into K.
That is, φ ∘ T ∈ V*. Thus the correspondence
φ ↦ φ ∘ T
is a mapping from U* into V*; we denote it by Tᵗ and call it the transpose of T. In other
words, Tᵗ: U* → V* is defined by
Tᵗ(φ) = φ ∘ T
Thus (Tᵗ(φ))(v) = φ(T(v)) for every v ∈ V.
Theorem 11.6: The transpose mapping Tᵗ defined above is linear.
Proof. For any scalars a, b ∈ K and any linear functionals φ, σ ∈ U*,
Tᵗ(aφ + bσ) = (aφ + bσ) ∘ T = a(φ ∘ T) + b(σ ∘ T)
= a Tᵗ(φ) + b Tᵗ(σ)
That is, Tᵗ is linear as claimed.
We emphasize that if T is a linear mapping from V into U, then Tᵗ is a linear mapping
from U* into V*.
The name "transpose" for the mapping Tᵗ no doubt derives from the following theorem.
Theorem 11.7: Let T: V → U be linear, and let A be the matrix representation of T rel-
ative to bases {vᵢ} of V and {uᵢ} of U. Then the transpose matrix Aᵗ is
the matrix representation of Tᵗ: U* → V* relative to the bases dual to
{uᵢ} and {vᵢ}.
Solved Problems
DUAL SPACES AND BASES
11.1. Let φ: R² → R and σ: R² → R be the linear functionals defined by φ(x, y) = x + 2y
and σ(x, y) = 3x − y. Find (i) φ + σ, (ii) 4φ, (iii) 2φ − 5σ.
(i) (φ + σ)(x, y) = φ(x, y) + σ(x, y) = x + 2y + 3x − y = 4x + y
(ii) (4φ)(x, y) = 4 φ(x, y) = 4(x + 2y) = 4x + 8y
(iii) (2φ − 5σ)(x, y) = 2 φ(x, y) − 5 σ(x, y) = 2(x + 2y) − 5(3x − y) = −13x + 9y
11.2. Consider the following basis of R³: {v₁ = (1, −1, 3), v₂ = (0, 1, −1), v₃ = (0, 3, −2)}.
Find the dual basis {φ₁, φ₂, φ₃}.
We seek linear functionals
φ₁(x, y, z) = a₁x + a₂y + a₃z, φ₂(x, y, z) = b₁x + b₂y + b₃z, φ₃(x, y, z) = c₁x + c₂y + c₃z
such that φ₁(v₁) = 1, φ₁(v₂) = 0, φ₁(v₃) = 0
φ₂(v₁) = 0, φ₂(v₂) = 1, φ₂(v₃) = 0
φ₃(v₁) = 0, φ₃(v₂) = 0, φ₃(v₃) = 1
We find φ₁ as follows:
φ₁(v₁) = φ₁(1, −1, 3) = a₁ − a₂ + 3a₃ = 1
φ₁(v₂) = φ₁(0, 1, −1) = a₂ − a₃ = 0
φ₁(v₃) = φ₁(0, 3, −2) = 3a₂ − 2a₃ = 0
Solving the system of equations, we obtain a₁ = 1, a₂ = 0, a₃ = 0. Thus φ₁(x, y, z) = x.
We next find φ₂:
φ₂(v₁) = φ₂(1, −1, 3) = b₁ − b₂ + 3b₃ = 0
φ₂(v₂) = φ₂(0, 1, −1) = b₂ − b₃ = 1
φ₂(v₃) = φ₂(0, 3, −2) = 3b₂ − 2b₃ = 0
Solving the system, we obtain b₁ = 7, b₂ = −2, b₃ = −3. Hence φ₂(x, y, z) = 7x − 2y − 3z.
Finally, we find φ₃:
φ₃(v₁) = φ₃(1, −1, 3) = c₁ − c₂ + 3c₃ = 0
φ₃(v₂) = φ₃(0, 1, −1) = c₂ − c₃ = 0
φ₃(v₃) = φ₃(0, 3, −2) = 3c₂ − 2c₃ = 1
Solving the system, we obtain c₁ = −2, c₂ = 1, c₃ = 1. Thus φ₃(x, y, z) = −2x + y + z.
11.3. Let V be the vector space of polynomials over R of degree ≤ 1, i.e. V =
{a + bt: a, b ∈ R}. Let φ₁: V → R and φ₂: V → R be defined by
φ₁(f(t)) = ∫₀¹ f(t) dt and φ₂(f(t)) = ∫₀² f(t) dt
(We remark that φ₁ and φ₂ are linear and so belong to the dual space V*.) Find the
basis {v₁, v₂} of V which is dual to {φ₁, φ₂}.
Let v₁ = a + bt and v₂ = c + dt. By definition of the dual basis,
φ₁(v₁) = 1, φ₂(v₁) = 0 and φ₁(v₂) = 0, φ₂(v₂) = 1
Thus
φ₁(v₁) = ∫₀¹ (a + bt) dt = a + ½b = 1 and φ₂(v₁) = ∫₀² (a + bt) dt = 2a + 2b = 0
or a = 2, b = −2
φ₁(v₂) = ∫₀¹ (c + dt) dt = c + ½d = 0 and φ₂(v₂) = ∫₀² (c + dt) dt = 2c + 2d = 1
or c = −½, d = 1
In other words, {2 − 2t, −½ + t} is the basis of V which is dual to {φ₁, φ₂}.
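The dual basis just found can be verified with exact rational arithmetic, since ∫₀ʳ (a + bt) dt = ra + r²b/2:

```python
from fractions import Fraction as F

# phi1(a + bt) = integral from 0 to 1 of (a + bt) dt = a + b/2
# phi2(a + bt) = integral from 0 to 2 of (a + bt) dt = 2a + 2b
def phi1(a, b):
    return a + F(b, 2)

def phi2(a, b):
    return 2 * a + 2 * b

# v1 = 2 - 2t and v2 = -1/2 + t from the solution above.
checks = [
    phi1(2, -2) == 1, phi2(2, -2) == 0,      # phi_i(v1) = delta_i1
    phi1(F(-1, 2), 1) == 0, phi2(F(-1, 2), 1) == 1,  # phi_i(v2) = delta_i2
]
```

All four conditions φᵢ(vⱼ) = δᵢⱼ hold, confirming that {2 − 2t, −½ + t} is the dual basis.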
11.4. Prove Theorem 11.1: Suppose {v₁, …, vₙ} is a basis of V over K. Let φ₁, …, φₙ ∈ V*
be the linear functionals defined by
φᵢ(vⱼ) = δᵢⱼ = 1 if i = j, and 0 if i ≠ j
Then {φ₁, …, φₙ} is a basis of V*.
We first show that {φ₁, …, φₙ} spans V*. Let φ be an arbitrary element of V*, and suppose
φ(v₁) = k₁, φ(v₂) = k₂, …, φ(vₙ) = kₙ
Set σ = k₁φ₁ + ⋯ + kₙφₙ. Then
σ(v₁) = (k₁φ₁ + ⋯ + kₙφₙ)(v₁)
= k₁ φ₁(v₁) + k₂ φ₂(v₁) + ⋯ + kₙ φₙ(v₁)
= k₁ · 1 + k₂ · 0 + ⋯ + kₙ · 0 = k₁
Similarly, for i = 2, …, n,
σ(vᵢ) = (k₁φ₁ + ⋯ + kₙφₙ)(vᵢ)
= k₁ φ₁(vᵢ) + ⋯ + kᵢ φᵢ(vᵢ) + ⋯ + kₙ φₙ(vᵢ) = kᵢ
Thus φ(vᵢ) = σ(vᵢ) for i = 1, …, n. Since φ and σ agree on the basis vectors, φ = σ =
k₁φ₁ + ⋯ + kₙφₙ. Accordingly, {φ₁, …, φₙ} spans V*.
It remains to be shown that {φ₁, …, φₙ} is linearly independent. Suppose
a₁φ₁ + a₂φ₂ + ⋯ + aₙφₙ = 0
Applying both sides to v₁, we obtain
0 = 0(v₁) = (a₁φ₁ + ⋯ + aₙφₙ)(v₁)
= a₁ φ₁(v₁) + a₂ φ₂(v₁) + ⋯ + aₙ φₙ(v₁)
= a₁ · 1 + a₂ · 0 + ⋯ + aₙ · 0 = a₁
Similarly, for i = 2, …, n,
0 = 0(vᵢ) = (a₁φ₁ + ⋯ + aₙφₙ)(vᵢ)
= a₁ φ₁(vᵢ) + ⋯ + aᵢ φᵢ(vᵢ) + ⋯ + aₙ φₙ(vᵢ) = aᵢ
That is, a₁ = 0, …, aₙ = 0. Hence {φ₁, …, φₙ} is linearly independent and so it is a basis of V*.
11.5. Prove Theorem 11.2: Let {v₁, …, vₙ} be a basis of V and let {φ₁, …, φₙ} be the dual
basis of V*. Then, for any vector u ∈ V,
u = φ₁(u)v₁ + φ₂(u)v₂ + ⋯ + φₙ(u)vₙ   (1)
and, for any linear functional σ ∈ V*,
σ = σ(v₁)φ₁ + σ(v₂)φ₂ + ⋯ + σ(vₙ)φₙ   (2)
Suppose u = a₁v₁ + a₂v₂ + ⋯ + aₙvₙ   (3)
Then
φ₁(u) = a₁ φ₁(v₁) + a₂ φ₁(v₂) + ⋯ + aₙ φ₁(vₙ) = a₁ · 1 + a₂ · 0 + ⋯ + aₙ · 0 = a₁
Similarly, for i = 2, …, n,
φᵢ(u) = a₁ φᵢ(v₁) + ⋯ + aᵢ φᵢ(vᵢ) + ⋯ + aₙ φᵢ(vₙ) = aᵢ
That is, φ₁(u) = a₁, φ₂(u) = a₂, …, φₙ(u) = aₙ. Substituting these results into (3), we obtain (1).
Next we prove (2). Applying the linear functional σ to both sides of (1),
σ(u) = φ₁(u)σ(v₁) + φ₂(u)σ(v₂) + ⋯ + φₙ(u)σ(vₙ)
= σ(v₁)φ₁(u) + σ(v₂)φ₂(u) + ⋯ + σ(vₙ)φₙ(u)
= (σ(v₁)φ₁ + σ(v₂)φ₂ + ⋯ + σ(vₙ)φₙ)(u)
Since the above holds for every u ∈ V, σ = σ(v₁)φ₁ + σ(v₂)φ₂ + ⋯ + σ(vₙ)φₙ as claimed.
11.6. Prove Theorem 11.3: Let {v₁, …, vₙ} and {w₁, …, wₙ} be bases of V and let
{φ₁, …, φₙ} and {σ₁, …, σₙ} be the bases of V* dual to {vᵢ} and {wᵢ} respectively.
Suppose P is the transition matrix from {vᵢ} to {wᵢ}. Then (P⁻¹)ᵗ is the transition
matrix from {φᵢ} to {σᵢ}.
Suppose
w₁ = a₁₁v₁ + a₁₂v₂ + ⋯ + a₁ₙvₙ    σ₁ = b₁₁φ₁ + b₁₂φ₂ + ⋯ + b₁ₙφₙ
w₂ = a₂₁v₁ + a₂₂v₂ + ⋯ + a₂ₙvₙ    σ₂ = b₂₁φ₁ + b₂₂φ₂ + ⋯ + b₂ₙφₙ
........................................................................
wₙ = aₙ₁v₁ + aₙ₂v₂ + ⋯ + aₙₙvₙ    σₙ = bₙ₁φ₁ + bₙ₂φ₂ + ⋯ + bₙₙφₙ
where P = (aᵢⱼ) and Q = (bᵢⱼ). We seek to prove that Q = (P⁻¹)ᵗ.
Let Rᵢ denote the ith row of Q and let Cⱼ denote the jth column of Pᵗ. Then
Rᵢ = (bᵢ₁, bᵢ₂, …, bᵢₙ) and Cⱼ = (aⱼ₁, aⱼ₂, …, aⱼₙ)ᵗ
By definition of the dual basis,
σᵢ(wⱼ) = (bᵢ₁φ₁ + bᵢ₂φ₂ + ⋯ + bᵢₙφₙ)(aⱼ₁v₁ + aⱼ₂v₂ + ⋯ + aⱼₙvₙ)
= bᵢ₁aⱼ₁ + bᵢ₂aⱼ₂ + ⋯ + bᵢₙaⱼₙ = RᵢCⱼ = δᵢⱼ
where δᵢⱼ is the Kronecker delta. Thus QPᵗ, whose ij-entry is RᵢCⱼ, is the identity matrix I,
and hence Q = (Pᵗ)⁻¹ = (P⁻¹)ᵗ as claimed.
11.7. Suppose V has finite dimension. Show that if v ∈ V, v ≠ 0, then there exists
φ ∈ V* such that φ(v) ≠ 0.
We extend {v} to a basis {v, v₂, …, vₙ} of V. By Theorem 6.1, there exists a unique linear
mapping φ: V → K such that φ(v) = 1 and φ(vᵢ) = 0, i = 2, …, n. Hence φ has the desired
property.
11.8. Prove Theorem 11.4: If V has finite dimension, then the mapping v ↦ v̂ is an
isomorphism of V onto V**. (Here v̂: V* → K is defined by v̂(φ) = φ(v).)
We first prove that the map v ↦ v̂ is linear, i.e. for any vectors v, w ∈ V and any scalars
a, b ∈ K, (av + bw)ˆ = a v̂ + b ŵ. For any linear functional φ ∈ V*,
(av + bw)ˆ(φ) = φ(av + bw) = a φ(v) + b φ(w)
= a v̂(φ) + b ŵ(φ) = (a v̂ + b ŵ)(φ)
Since (av + bw)ˆ(φ) = (a v̂ + b ŵ)(φ) for every φ ∈ V*, we have (av + bw)ˆ = a v̂ + b ŵ. Thus the
map v ↦ v̂ is linear.
Now suppose v ∈ V, v ≠ 0. Then, by the preceding problem, there exists φ ∈ V* for which
φ(v) ≠ 0. Hence v̂(φ) = φ(v) ≠ 0 and thus v̂ ≠ 0. Since v ≠ 0 implies v̂ ≠ 0, the map v ↦ v̂
is nonsingular and hence an isomorphism (Theorem 6.5).
Now dim V = dim V* = dim V** because V has finite dimension. Accordingly, the mapping v ↦ v̂
is an isomorphism of V onto V**.
ANNIHILATORS
11.9. Show that if φ ∈ V* annihilates a subset S of V, then φ annihilates the linear span
L(S) of S. Hence S⁰ = (L(S))⁰.
Suppose v ∈ L(S). Then there exist w₁, …, w_r ∈ S for which v = a₁w₁ + a₂w₂ + ⋯ + a_r w_r. Then
φ(v) = a₁ φ(w₁) + a₂ φ(w₂) + ⋯ + a_r φ(w_r) = a₁0 + a₂0 + ⋯ + a_r 0 = 0
Since v was an arbitrary element of L(S), φ annihilates L(S) as claimed.
11.10. Let W be the subspace of R⁴ spanned by v₁ = (1, 2, −3, 4) and v₂ = (0, 1, 4, −1). Find
a basis of the annihilator of W.
By the preceding problem, it suffices to find a basis of the set of linear functionals φ(x, y, z, w) =
ax + by + cz + dw for which φ(v₁) = 0 and φ(v₂) = 0:
φ(1, 2, −3, 4) = a + 2b − 3c + 4d = 0
φ(0, 1, 4, −1) = b + 4c − d = 0
The system of equations in unknowns a, b, c, d is in echelon form with free variables c and d.
Set c = 1, d = 0 to obtain the solution a = 11, b = −4, c = 1, d = 0 and hence the linear func-
tional φ₁(x, y, z, w) = 11x − 4y + z.
Set c = 0, d = 1 to obtain the solution a = −6, b = 1, c = 0, d = 1 and hence the linear func-
tional φ₂(x, y, z, w) = −6x + y + w.
The set of linear functionals {φ₁, φ₂} is a basis of W⁰, the annihilator of W.
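A quick numerical check that two coefficient vectors of this form (up to an overall sign, which does not change the span) annihilate both spanning vectors of W and are linearly independent:

```python
import numpy as np

# Spanning vectors of the subspace W of R^4.
v1 = np.array([1.0, 2.0, -3.0, 4.0])
v2 = np.array([0.0, 1.0, 4.0, -1.0])

# Coefficient vectors of two candidate annihilating functionals.
phi1 = np.array([11.0, -4.0, 1.0, 0.0])   # 11x - 4y + z
phi2 = np.array([-6.0, 1.0, 0.0, 1.0])    # -6x + y + w

# Each functional vanishes on both spanning vectors, hence on all of W.
annihilates = all(abs(f @ v) < 1e-12 for f in (phi1, phi2) for v in (v1, v2))
# The two functionals are independent, so dim W0 = 4 - dim W = 2 is attained.
independent = np.linalg.matrix_rank(np.vstack([phi1, phi2])) == 2
```

The check confirms dim W + dim W⁰ = 2 + 2 = 4 = dim R⁴, as Theorem 11.5 predicts.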
11.11. Show that: (i) for any subset S of V, S ⊆ S⁰⁰; (ii) if S₁ ⊆ S₂, then S₂⁰ ⊆ S₁⁰.
(i) Let v ∈ S. Then for every linear functional φ ∈ S⁰, v̂(φ) = φ(v) = 0. Hence v̂ ∈ (S⁰)⁰.
Therefore, under the identification of V and V**, v ∈ S⁰⁰. Accordingly, S ⊆ S⁰⁰.
(ii) Let φ ∈ S₂⁰. Then φ(v) = 0 for every v ∈ S₂. But S₁ ⊆ S₂; hence φ annihilates every ele-
ment of S₁, i.e. φ ∈ S₁⁰. Therefore S₂⁰ ⊆ S₁⁰.
11.12. Prove Theorem 11.5: Suppose V has finite dimension and W is a subspace of V.
Then (i) dim W + dim W⁰ = dim V and (ii) W⁰⁰ = W.
(i) Suppose dim V = n and dim W = r ≤ n. We want to show that dim W⁰ = n − r. We choose
a basis {w₁, …, w_r} of W and extend it to the following basis of V: {w₁, …, w_r, v₁, …, v_{n−r}}.
Consider the dual basis
{φ₁, …, φ_r, σ₁, …, σ_{n−r}}
By definition of the dual basis, each of the above σ's annihilates each wᵢ; hence
σ₁, …, σ_{n−r} ∈ W⁰. We claim that {σᵢ} is a basis of W⁰. Now {σᵢ} is part of a basis of V* and
so it is linearly independent.
We next show that {σᵢ} spans W⁰. Let σ ∈ W⁰. By Theorem 11.2,
σ = σ(w₁)φ₁ + ⋯ + σ(w_r)φ_r + σ(v₁)σ₁ + ⋯ + σ(v_{n−r})σ_{n−r}
= 0φ₁ + ⋯ + 0φ_r + σ(v₁)σ₁ + ⋯ + σ(v_{n−r})σ_{n−r}
= σ(v₁)σ₁ + ⋯ + σ(v_{n−r})σ_{n−r}
Thus {σ₁, …, σ_{n−r}} spans W⁰ and so it is a basis of W⁰. Accordingly, dim W⁰ = n − r =
dim V − dim W as required.
(ii) Suppose dim V = n and dim W = r. Then dim V* = n and, by (i), dim W⁰ = n − r. Thus
by (i), dim W⁰⁰ = n − (n − r) = r; therefore dim W = dim W⁰⁰. By the preceding problem,
W ⊆ W⁰⁰. Accordingly, W = W⁰⁰.
11.13. Let U and W be subspaces of V. Prove: (U + W)⁰ = U⁰ ∩ W⁰.
Let φ ∈ (U + W)⁰. Then φ annihilates U + W and so, in particular, φ annihilates U and W.
That is, φ ∈ U⁰ and φ ∈ W⁰; hence φ ∈ U⁰ ∩ W⁰. Thus (U + W)⁰ ⊆ U⁰ ∩ W⁰.
On the other hand, suppose σ ∈ U⁰ ∩ W⁰. Then σ annihilates U and also W. If v ∈ U + W,
then v = u + w where u ∈ U and w ∈ W. Hence σ(v) = σ(u) + σ(w) = 0 + 0 = 0. Thus σ
annihilates U + W, i.e. σ ∈ (U + W)⁰. Accordingly, U⁰ ∩ W⁰ ⊆ (U + W)⁰.
Both inclusion relations give us the desired equality.
Remark: Observe that no dimension argument is employed in the proof; hence the result holds
for spaces of finite or infinite dimension.
TRANSPOSE OF A LINEAR MAPPING
11.14. Let φ be the linear functional on R² defined by φ(x, y) = x − 2y. For each of the
following linear operators T on R², find (Tᵗ(φ))(x, y): (i) T(x, y) = (x, 0); (ii) T(x, y) =
(y, x + y); (iii) T(x, y) = (2x − 3y, 5x + 2y).
By definition of the transpose mapping, Tᵗ(φ) = φ ∘ T, i.e. (Tᵗ(φ))(v) = φ(T(v)) for every
vector v. Hence
(i) (Tᵗ(φ))(x, y) = φ(T(x, y)) = φ(x, 0) = x
(ii) (Tᵗ(φ))(x, y) = φ(T(x, y)) = φ(y, x + y) = y − 2(x + y) = −2x − y
(iii) (Tᵗ(φ))(x, y) = φ(T(x, y)) = φ(2x − 3y, 5x + 2y) = (2x − 3y) − 2(5x + 2y) = −8x − 7y
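In matrix terms, a functional on R² is a row vector and Tᵗ(φ) is the row vector obtained by multiplying φ on the right by the matrix of T; for part (iii):

```python
import numpy as np

phi = np.array([1.0, -2.0])        # phi(x, y) = x - 2y, as a row vector

# Matrix of T(x, y) = (2x - 3y, 5x + 2y) acting on column vectors.
T = np.array([[2.0, -3.0],
              [5.0,  2.0]])

# (T^t(phi))(v) = phi(T(v)) = (phi @ T) @ v, so phi @ T gives the
# coefficient vector of the transposed functional.
phi_T = phi @ T
matches = np.allclose(phi_T, [-8.0, -7.0])   # i.e. -8x - 7y, as computed above
```

The same multiplication reproduces parts (i) and (ii) with the corresponding matrices of T.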
11.15. Let T: V → U be linear and let Tᵗ: U* → V* be its transpose. Show that the kernel
of Tᵗ is the annihilator of the image of T, i.e. Ker Tᵗ = (Im T)⁰.
Suppose φ ∈ Ker Tᵗ; that is, Tᵗ(φ) = φ ∘ T = 0. If u ∈ Im T, then u = T(v) for some
v ∈ V; hence
φ(u) = φ(T(v)) = (φ ∘ T)(v) = 0(v) = 0
We have that φ(u) = 0 for every u ∈ Im T; hence φ ∈ (Im T)⁰. Thus Ker Tᵗ ⊆ (Im T)⁰.
On the other hand, suppose σ ∈ (Im T)⁰; that is, σ(Im T) = {0}. Then, for every v ∈ V,
(Tᵗ(σ))(v) = (σ ∘ T)(v) = σ(T(v)) = 0 = 0(v)
We have that (Tᵗ(σ))(v) = 0(v) for every v ∈ V; hence Tᵗ(σ) = 0. Therefore σ ∈ Ker Tᵗ and so
(Im T)⁰ ⊆ Ker Tᵗ.
Both inclusion relations give us the required equality.
11.16. Suppose V and U have finite dimension and suppose T: V → U is linear. Prove:
rank (T) = rank (Tᵗ)
Suppose dim V = n and dim U = m. Also suppose rank (T) = r. Then, by Theorem 11.5,
dim ((Im T)⁰) = dim U − dim (Im T) = m − rank (T) = m − r
By the preceding problem, Ker Tᵗ = (Im T)⁰. Hence nullity (Tᵗ) = m − r. It then follows that, as
claimed,
rank (Tᵗ) = dim U* − nullity (Tᵗ) = m − (m − r) = r = rank (T)
11.17. Prove Theorem 11.7: Let T: V → U be linear and let A be the matrix representation
of T relative to bases {v₁, …, vₘ} of V and {u₁, …, uₙ} of U. Then the transpose
matrix Aᵗ is the matrix representation of Tᵗ: U* → V* relative to the bases dual to
{uᵢ} and {vⱼ}.
Suppose T(v₁) = a₁₁u₁ + a₁₂u₂ + ⋯ + a₁ₙuₙ
T(v₂) = a₂₁u₁ + a₂₂u₂ + ⋯ + a₂ₙuₙ
............................................   (1)
T(vₘ) = aₘ₁u₁ + aₘ₂u₂ + ⋯ + aₘₙuₙ
We want to prove that
Tᵗ(σ₁) = a₁₁φ₁ + a₂₁φ₂ + ⋯ + aₘ₁φₘ
Tᵗ(σ₂) = a₁₂φ₁ + a₂₂φ₂ + ⋯ + aₘ₂φₘ
............................................   (2)
Tᵗ(σₙ) = a₁ₙφ₁ + a₂ₙφ₂ + ⋯ + aₘₙφₘ
where {σᵢ} and {φⱼ} are the bases dual to {uᵢ} and {vⱼ} respectively.
Let v ∈ V and suppose v = k₁v₁ + k₂v₂ + ⋯ + kₘvₘ. Then, by (1),
T(v) = k₁ T(v₁) + k₂ T(v₂) + ⋯ + kₘ T(vₘ)
= k₁(a₁₁u₁ + ⋯ + a₁ₙuₙ) + k₂(a₂₁u₁ + ⋯ + a₂ₙuₙ) + ⋯ + kₘ(aₘ₁u₁ + ⋯ + aₘₙuₙ)
= (k₁a₁₁ + k₂a₂₁ + ⋯ + kₘaₘ₁)u₁ + ⋯ + (k₁a₁ₙ + k₂a₂ₙ + ⋯ + kₘaₘₙ)uₙ
= Σᵢ₌₁ⁿ (k₁a₁ᵢ + k₂a₂ᵢ + ⋯ + kₘaₘᵢ)uᵢ
Hence for j = 1, …, n,
(Tᵗ(σⱼ))(v) = σⱼ(T(v)) = σⱼ( Σᵢ₌₁ⁿ (k₁a₁ᵢ + k₂a₂ᵢ + ⋯ + kₘaₘᵢ)uᵢ )
= k₁a₁ⱼ + k₂a₂ⱼ + ⋯ + kₘaₘⱼ   (3)
On the other hand, for j = 1, …, n,
(a₁ⱼφ₁ + a₂ⱼφ₂ + ⋯ + aₘⱼφₘ)(v) = (a₁ⱼφ₁ + a₂ⱼφ₂ + ⋯ + aₘⱼφₘ)(k₁v₁ + k₂v₂ + ⋯ + kₘvₘ)
= k₁a₁ⱼ + k₂a₂ⱼ + ⋯ + kₘaₘⱼ   (4)
Since v ∈ V was arbitrary, (3) and (4) imply that
Tᵗ(σⱼ) = a₁ⱼφ₁ + a₂ⱼφ₂ + ⋯ + aₘⱼφₘ, j = 1, …, n
which is (2). Thus the theorem is proved.
11.18. Let A be an arbitrary m × n matrix over a field K. Prove that the row rank and the
column rank of A are equal.
Let T: Kⁿ → Kᵐ be the linear map defined by T(v) = Av, where the elements of Kⁿ and Kᵐ
are written as column vectors. Then A is the matrix representation of T relative to the usual bases
of Kⁿ and Kᵐ, and the image of T is the column space of A. Hence
rank (T) = column rank of A
By Theorem 11.7, Aᵗ is the matrix representation of Tᵗ relative to the dual bases. Hence
rank (Tᵗ) = column rank of Aᵗ = row rank of A
But by Problem 11.16, rank (T) = rank (Tᵗ); hence the row rank and the column rank of A are
equal. (This result was stated earlier as Theorem 5.9, page 90, and was proved in a direct way
in Problem 5.21.)
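The equality just proved is easy to test numerically on a random rectangular matrix (the test matrix below is our own):

```python
import numpy as np

# A random 4 x 6 integer matrix, seeded for reproducibility.
rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(4, 6)).astype(float)

# Column rank of A equals column rank of A^t, i.e. the row rank of A.
rank_A = np.linalg.matrix_rank(A)
rank_At = np.linalg.matrix_rank(A.T)
same = (rank_A == rank_At)
```

Of course a single numerical check is no substitute for the proof; it merely illustrates the statement rank (A) = rank (Aᵗ).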
Supplementary Problems
DUAL SPACES AND DUAL BASES
11.19. Let φ: R³ → R and σ: R³ → R be the linear functionals defined by φ(x, y, z) = 2x − 3y + z and
σ(x, y, z) = 4x − 2y + 3z. Find (i) φ + σ, (ii) 3φ, (iii) 2φ − 5σ.
11.20. Let φ be the linear functional on R² defined by φ(2, 1) = 15 and φ(1, −2) = −10. Find φ(x, y) and,
in particular, find φ(−2, 7).
11.21. Find the dual basis of each of the following bases of R³:
(i) {(1, 0, 0), (0, 1, 0), (0, 0, 1)}, (ii) {(1, 2, 3), (1, 1, 1), (2, 4, 7)}.
11.22. Let V be the vector space of polynomials over R of degree ≤ 2. Let φ₁, φ₂ and φ₃ be the linear
functionals on V defined by
φ₁(f(t)) = ∫₀¹ f(t) dt, φ₂(f(t)) = f′(1), φ₃(f(t)) = f(0)
Here f(t) = a + bt + ct² ∈ V and f′(t) denotes the derivative of f(t). Find the basis {f₁(t), f₂(t), f₃(t)}
of V which is dual to {φ₁, φ₂, φ₃}.
11.23. Suppose u, v ∈ V and that φ(u) = 0 implies φ(v) = 0 for all φ ∈ V*. Show that v = ku for
some scalar k.
11.24. Suppose φ, σ ∈ V* and that φ(v) = 0 implies σ(v) = 0 for all v ∈ V. Show that σ = kφ for
some scalar k.
11.25. Let V be the vector space of polynomials over K. For a ∈ K, define φₐ: V → K by φₐ(f(t)) = f(a).
Show that: (i) φₐ is linear; (ii) if a ≠ b, then φₐ ≠ φ_b.
11.26. Let V be the vector space of polynomials of degree ≤ 2. Let a, b, c ∈ K be distinct scalars. Let
φₐ, φ_b and φ_c be the linear functionals defined by φₐ(f(t)) = f(a), φ_b(f(t)) = f(b), φ_c(f(t)) = f(c). Show
that {φₐ, φ_b, φ_c} is linearly independent, and find the basis {f₁(t), f₂(t), f₃(t)} of V which is its dual.
11.27. Let V be the vector space of square matrices of order n. Let T: V → K be the trace mapping:
T(A) = a₁₁ + a₂₂ + ⋯ + aₙₙ, where A = (aᵢⱼ). Show that T is linear.
11.28. Let W be a subspace of V. For any linear functional φ on W, show that there is a linear functional
σ on V such that σ(w) = φ(w) for any w ∈ W, i.e. φ is the restriction of σ to W.
11.29. Let {e₁, …, eₙ} be the usual basis of Kⁿ. Show that the dual basis is {π₁, …, πₙ} where πᵢ is the
ith projection mapping: πᵢ(a₁, …, aₙ) = aᵢ.
11.30. Let V be a vector space over R. Let φ₁, φ₂ ∈ V* and suppose σ: V → R defined by σ(v) = φ₁(v) φ₂(v)
also belongs to V*. Show that either φ₁ = 0 or φ₂ = 0.
ANNIHILATORS
11.31. Let W be the subspace of R⁴ spanned by (1, 2, 3, 4), (1, 3, 2, 6) and (1, 4, 1, 8). Find a basis of
the annihilator of W.
11.32. Let W be the subspace of R³ spanned by (1, 1, 0) and (0, 1, 1). Find a basis of the annihilator of W.
11.33. Show that, for any subset S of V, L{S) = S"" where L{S) is the linear span of S.
11.34. Let U and W be subspaces of a vector space V of finite dimension. Prove: {U n W)" = U" + W.
11.35. Suppose V  U @ W. Prove that V* = W ® WO.
TRANSPOSE OF A LINEAR MAPPING
11.36. Let φ be the linear functional on R² defined by φ(x, y) = 3x − 2y. For each linear mapping
T : R³ → R², find (Tᵗ(φ))(x, y, z):
(i) T(x, y, z) = (x + y, y + z); (ii) T(x, y, z) = (x + y + z, 2x − y).

11.37. Suppose S : U → V and T : V → W are linear. Prove that (T∘S)ᵗ = Sᵗ∘Tᵗ.

11.38. Suppose T : V → U is linear and V has finite dimension. Prove that Im Tᵗ = (Ker T)⁰.

11.39. Suppose T : V → U is linear and u ∈ U. Prove that u ∈ Im T or there exists φ ∈ U* such that
Tᵗ(φ) = 0 and φ(u) = 1.

11.40. Let V be of finite dimension. Show that the mapping T ↦ Tᵗ is an isomorphism from Hom(V, V)
onto Hom(V*, V*). (Here T is any linear operator on V.)
MISCELLANEOUS PROBLEMS
11.41. Let V be a vector space over R. The line segment uv joining points u, v ∈ V is defined by
uv = {tu + (1 − t)v : 0 ≤ t ≤ 1}. A subset S of V is termed convex if u, v ∈ S implies uv ⊆ S.
Let φ ∈ V* and let

    W⁺ = {v ∈ V : φ(v) > 0},   W = {v ∈ V : φ(v) = 0},   W⁻ = {v ∈ V : φ(v) < 0}

Prove that W⁺, W and W⁻ are convex.

11.42. Let V be a vector space of finite dimension. A hyperplane H of V is defined to be the kernel of a
nonzero linear functional φ on V. Show that every subspace of V is the intersection of a finite
number of hyperplanes.
Answers to Supplementary Problems
11.19. (i) 6x − 5y + 4z, (ii) 6x − 9y + 5z, (iii) −16x + 4y − 13z

11.20. φ(x, y) = 4x + 7y, φ(−2, 7) = 41

11.21. (i) {φ₁(x, y, z) = x, φ₂(x, y, z) = y, φ₃(x, y, z) = z}
(ii) {φ₁(x, y, z) = −3x − 5y − 2z, φ₂(x, y, z) = 2x + y, φ₃(x, y, z) = x + 2y + z}

11.25. (ii) Let f(t) = t. Then φ_a(f(t)) = a ≠ b = φ_b(f(t)), and therefore φ_a ≠ φ_b.

11.26. f₁(t) = (t − b)(t − c)/((a − b)(a − c)), f₂(t) = (t − a)(t − c)/((b − a)(b − c)), f₃(t) = (t − a)(t − b)/((c − a)(c − b))

11.31. {φ₁(x, y, z, t) = 5x − y + z, φ₂(x, y, z, t) = 2y − t}

11.32. {φ(x, y, z) = x − y + z}

11.36. (i) (Tᵗ(φ))(x, y, z) = 3x + y − 2z, (ii) (Tᵗ(φ))(x, y, z) = −x + 5y + 3z
Chapter 12

Bilinear, Quadratic and Hermitian Forms
BILINEAR FORMS
Let V be a vector space of finite dimension over a field K. A bilinear form on V is a
mapping f : V × V → K which satisfies

(i)  f(au₁ + bu₂, v) = af(u₁, v) + bf(u₂, v)
(ii) f(u, av₁ + bv₂) = af(u, v₁) + bf(u, v₂)

for all a, b ∈ K and all uᵢ, vᵢ ∈ V. We express condition (i) by saying f is linear in the
first variable, and condition (ii) by saying f is linear in the second variable.
Example 12.1: Let φ and σ be arbitrary linear functionals on V. Let f : V × V → K be defined by
f(u, v) = φ(u)σ(v). Then f is bilinear because φ and σ are each linear. (Such a
bilinear form f turns out to be the "tensor product" of φ and σ and so is sometimes
written f = φ ⊗ σ.)

Example 12.2: Let f be the dot product on Rⁿ; that is,

    f(u, v) = u·v = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ

where u = (aᵢ) and v = (bᵢ). Then f is a bilinear form on Rⁿ.
Example 12.3: Let A = (aᵢⱼ) be any n×n matrix over K. Then A may be viewed as a bilinear
form f on Kⁿ by defining

    f(X, Y) = XᵗAY = a₁₁x₁y₁ + a₁₂x₁y₂ + ⋯ + aₙₙxₙyₙ = Σᵢ,ⱼ aᵢⱼ xᵢyⱼ

The above formal expression in the variables xᵢ, yⱼ is termed the bilinear polynomial
corresponding to the matrix A. Formula (1) below shows that, in a certain sense,
every bilinear form is of this type.
We will let B(V) denote the set of bilinear forms on V. A vector space structure is
placed on B(V) by defining f + g and kf by:

    (f + g)(u, v) = f(u, v) + g(u, v)
    (kf)(u, v) = k f(u, v)

for any f, g ∈ B(V) and any k ∈ K. In fact,

Theorem 12.1: Let V be a vector space of dimension n over K. Let {φ₁, ..., φₙ} be a
basis of the dual space V*. Then {fᵢⱼ : i, j = 1, ..., n} is a basis of B(V)
where fᵢⱼ is defined by fᵢⱼ(u, v) = φᵢ(u)φⱼ(v). Thus, in particular,
dim B(V) = n².
BILINEAR FORMS AND MATRICES
Let f be a bilinear form on V, and let {e₁, ..., eₙ} be a basis of V. Suppose u, v ∈ V
and suppose

    u = a₁e₁ + ⋯ + aₙeₙ,   v = b₁e₁ + ⋯ + bₙeₙ

Then

    f(u, v) = f(a₁e₁ + ⋯ + aₙeₙ, b₁e₁ + ⋯ + bₙeₙ)
            = a₁b₁f(e₁, e₁) + a₁b₂f(e₁, e₂) + ⋯ + aₙbₙf(eₙ, eₙ) = Σᵢ,ⱼ₌₁ⁿ aᵢbⱼ f(eᵢ, eⱼ)

Thus f is completely determined by the n² values f(eᵢ, eⱼ).

The matrix A = (aᵢⱼ) where aᵢⱼ = f(eᵢ, eⱼ) is called the matrix representation of f relative
to the basis {eᵢ} or, simply, the matrix of f in {eᵢ}. It "represents" f in the sense that

    f(u, v) = Σᵢ,ⱼ aᵢbⱼ f(eᵢ, eⱼ) = (a₁, ..., aₙ) A (b₁, ..., bₙ)ᵗ = [u]ₑᵗ A [v]ₑ     (1)

for all u, v ∈ V. (As usual, [u]ₑ denotes the coordinate (column) vector of u ∈ V in the
basis {eᵢ}.)
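Formula (1) can be checked numerically. The following Python sketch (our helper names, not the text's) builds the matrix A = (f(eᵢ, eⱼ)) of a sample bilinear form on R² and verifies f(u, v) = [u]ᵗA[v] on several coordinate vectors:

```python
# Sample bilinear form on R^2 and the basis {e1 = (1, 0), e2 = (1, 1)}.
def f(u, v):
    return 2*u[0]*v[0] - 3*u[0]*v[1] + u[1]*v[1]

basis = [(1, 0), (1, 1)]

# Matrix of f in the basis: a_ij = f(e_i, e_j).
A = [[f(ei, ej) for ej in basis] for ei in basis]

def from_coords(c):
    # the vector whose coordinate (column) vector in the basis is c
    return tuple(sum(c[i]*basis[i][k] for i in range(2)) for k in range(2))

def via_matrix(a, b):
    # [u]^t A [v]
    return sum(a[i]*A[i][j]*b[j] for i in range(2) for j in range(2))

for a in [(1, 2), (-3, 0), (2, 5)]:
    for b in [(0, 1), (4, -1)]:
        assert f(from_coords(a), from_coords(b)) == via_matrix(a, b)
```

The identity holds exactly because f is bilinear, so it is determined by its values on basis pairs.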
We next ask, how does a matrix representing a bilinear form transform when a new
basis is selected? The answer is given in the following theorem. (Recall Theorem 7.4: the
transition matrix P from one basis {eᵢ} to another {eᵢ'} has the property that [u]ₑ = P[u]ₑ'
for every u ∈ V.)

Theorem 12.2: Let P be the transition matrix from one basis to another. If A is the
matrix of f in the original basis, then

    B = PᵗAP

is the matrix of f in the new basis.

The above theorem motivates the following definition.

Definition: A matrix B is said to be congruent to a matrix A if there exists an invertible
(or: nonsingular) matrix P such that B = PᵗAP.

Thus by the above theorem matrices representing the same bilinear form are congruent.
We remark that congruent matrices have the same rank because P and Pᵗ are nonsingular;
hence the following definition is well defined.

Definition: The rank of a bilinear form f on V, written rank(f), is defined to be the rank
of any matrix representation. We say that f is degenerate or nondegenerate
according as rank(f) < dim V or rank(f) = dim V.
ALTERNATING BILINEAR FORMS

A bilinear form f on V is said to be alternating if

(i) f(v, v) = 0

for every v ∈ V. If f is alternating, then

    0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v)

and so

(ii) f(u, v) = −f(v, u)

for every u, v ∈ V. A bilinear form which satisfies condition (ii) is said to be skew symmetric
(or: antisymmetric). If 1 + 1 ≠ 0 in K, then condition (ii) implies f(v, v) = −f(v, v)
which implies condition (i). In other words, alternating and skew symmetric are equivalent
when 1 + 1 ≠ 0.
The main structure theorem of alternating bilinear forms follows.

Theorem 12.3: Let f be an alternating bilinear form on V. Then there exists a basis of
V in which f is represented by a block diagonal matrix

    diag(M, M, ..., M, 0, 0, ..., 0)   where   M = [  0  1 ]
                                                   [ -1  0 ]

Moreover, the number of blocks M is uniquely determined by f (because
it is equal to ½ rank(f)).

In particular, the above theorem shows that an alternating bilinear form must have
even rank.
SYMMETRIC BILINEAR FORMS, QUADRATIC FORMS

A bilinear form f on V is said to be symmetric if

    f(u, v) = f(v, u)

for every u, v ∈ V. If A is a matrix representation of f, we can write

    f(X, Y) = XᵗAY = (XᵗAY)ᵗ = YᵗAᵗX

(We use the fact that XᵗAY is a scalar and therefore equals its transpose.) Thus if f is
symmetric,

    YᵗAᵗX = f(X, Y) = f(Y, X) = YᵗAX

and since this is true for all vectors X, Y it follows that A = Aᵗ, i.e. A is symmetric. Conversely,
if A is symmetric, then f is symmetric.

The main result for symmetric bilinear forms is given in

Theorem 12.4: Let f be a symmetric bilinear form on V over K (in which 1 + 1 ≠ 0).
Then V has a basis {v₁, ..., vₙ} in which f is represented by a diagonal
matrix, i.e. f(vᵢ, vⱼ) = 0 for i ≠ j.

Alternate Form of Theorem 12.4: Let A be a symmetric matrix over K (in which 1 + 1 ≠ 0).
Then there exists an invertible (or: nonsingular) matrix P such that PᵗAP
is diagonal. That is, A is congruent to a diagonal matrix.
Since an invertible matrix P is a product of elementary matrices (Problem 3.36), one way
of obtaining the diagonal form PᵗAP is by a sequence of elementary row operations and
the same sequence of elementary column operations. These same elementary row operations
applied to I will yield Pᵗ. This method is illustrated in the next example.
Example 12.4: Let A = [  1  2 -3 ]
                      [  2  5 -4 ] ,  a symmetric matrix. It is convenient to form the block
                      [ -3 -4  8 ]
matrix (A, I):

    (A, I) = [  1  2 -3 | 1 0 0 ]
             [  2  5 -4 | 0 1 0 ]
             [ -3 -4  8 | 0 0 1 ]

We apply the operations R₂ → −2R₁ + R₂ and R₃ → 3R₁ + R₃ to (A, I), and then
the corresponding operations C₂ → −2C₁ + C₂ and C₃ → 3C₁ + C₃ to A to obtain

    [ 1  2 -3 |  1 0 0 ]              [ 1  0  0 |  1 0 0 ]
    [ 0  1  2 | -2 1 0 ]   and then   [ 0  1  2 | -2 1 0 ]
    [ 0  2 -1 |  3 0 1 ]              [ 0  2 -1 |  3 0 1 ]

We next apply the operation R₃ → −2R₂ + R₃ and then the corresponding operation
C₃ → −2C₂ + C₃ to obtain

    [ 1 0  0 |  1  0 0 ]              [ 1 0  0 |  1  0 0 ]
    [ 0 1  2 | -2  1 0 ]   and then   [ 0 1  0 | -2  1 0 ]
    [ 0 0 -5 |  7 -2 1 ]              [ 0 0 -5 |  7 -2 1 ]

Now A has been diagonalized. We set

    P = [ 1 -2  7 ]                    [ 1 0  0 ]
        [ 0  1 -2 ]   and then PᵗAP =  [ 0 1  0 ]
        [ 0  0  1 ]                    [ 0 0 -5 ]
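The final congruence of Example 12.4 is easy to verify with plain Python lists (a quick check, not part of the original text):

```python
# Verify P^t A P = diag(1, 1, -5) for the matrices of Example 12.4.
def matmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(r) for r in zip(*X)]

A = [[1, 2, -3],
     [2, 5, -4],
     [-3, -4, 8]]
P = [[1, -2, 7],
     [0, 1, -2],
     [0, 0, 1]]

D = matmul(matmul(transpose(P), A), P)
assert D == [[1, 0, 0], [0, 1, 0], [0, 0, -5]]
```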
Definition: A mapping q : V → K is called a quadratic form if q(v) = f(v, v) for some
symmetric bilinear form f on V.

We call q the quadratic form associated with the symmetric bilinear form f. If
1 + 1 ≠ 0 in K, then f is obtainable from q according to the identity

    f(u, v) = ½(q(u + v) − q(u) − q(v))

The above formula is called the polar form of f.

Now if f is represented by a symmetric matrix A = (aᵢⱼ), then q is represented in the
form

    q(X) = XᵗAX = Σᵢ,ⱼ aᵢⱼxᵢxⱼ = a₁₁x₁² + a₂₂x₂² + ⋯ + aₙₙxₙ² + 2 Σᵢ<ⱼ aᵢⱼxᵢxⱼ

The above formal expression in the variables xᵢ is termed the quadratic polynomial corresponding
to the symmetric matrix A. Observe that if the matrix A is diagonal, then q has the
diagonal representation

    q(X) = XᵗAX = a₁₁x₁² + a₂₂x₂² + ⋯ + aₙₙxₙ²

that is, the quadratic polynomial representing q will contain no "cross product" terms. By
Theorem 12.4, every quadratic form has such a representation (when 1 + 1 ≠ 0).
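The factor 2 on the cross-product terms can be seen concretely; a small Python sketch with a sample symmetric matrix (our choice, not from the text):

```python
# q(X) = X^t A X for a symmetric A given as a list of lists.
def q(A, X):
    n = len(X)
    return sum(A[i][j]*X[i]*X[j] for i in range(n) for j in range(n))

A = [[1, 2],
     [2, 5]]    # symmetric; off-diagonal 2 contributes 2*2*x*y = 4xy
# So q(x, y) = x^2 + 4xy + 5y^2:
assert q(A, [1, 0]) == 1
assert q(A, [0, 1]) == 5
assert q(A, [1, 1]) == 1 + 4 + 5
```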
Example 12.5: Consider the following quadratic form on R²:

    q(x, y) = 2x² − 12xy + 5y²

One way of diagonalizing q is by the method known as "completing the square",
which is fully described in Problem 12.35. In this case, we make the substitution
x = s + 3t, y = t to obtain the diagonal form

    q(x, y) = 2(s + 3t)² − 12(s + 3t)t + 5t² = 2s² − 13t²
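The substitution of Example 12.5 can be checked numerically over a grid of integer values (a quick sanity check, not part of the text):

```python
# Under x = s + 3t, y = t, the form 2x^2 - 12xy + 5y^2 becomes 2s^2 - 13t^2.
def q(x, y):
    return 2*x**2 - 12*x*y + 5*y**2

for s in range(-3, 4):
    for t in range(-3, 4):
        x, y = s + 3*t, t
        assert q(x, y) == 2*s**2 - 13*t**2
```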
REAL SYMMETRIC BILINEAR FORMS. LAW OF INERTIA

In this section we treat symmetric bilinear forms and quadratic forms on vector
spaces over the real field R. These forms appear in many branches of mathematics and
physics. The special nature of R permits an independent theory. The main result follows.

Theorem 12.5: Let f be a symmetric bilinear form on V over R. Then there is a basis of
V in which f is represented by a diagonal matrix; every other diagonal
representation has the same number P of positive entries and the same
number N of negative entries. The difference S = P − N is called the
signature of f.

A real symmetric bilinear form f is said to be nonnegative semidefinite if

    q(v) = f(v, v) ≥ 0

for every vector v; and is said to be positive definite if

    q(v) = f(v, v) > 0

for every vector v ≠ 0. By the above theorem,

(i) f is nonnegative semidefinite if and only if S = rank(f)
(ii) f is positive definite if and only if S = dim V

where S is the signature of f.
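Both criteria can be read off mechanically from any diagonal representation; a short Python sketch (helper names are ours):

```python
# Signature and definiteness tests from the diagonal entries of a
# diagonal representation of a real symmetric bilinear form.
def signature(diag):
    P = sum(1 for d in diag if d > 0)
    N = sum(1 for d in diag if d < 0)
    return P - N

def is_nonneg_semidefinite(diag):
    rank = sum(1 for d in diag if d != 0)
    return signature(diag) == rank           # criterion (i): S = rank(f)

def is_positive_definite(diag):
    return signature(diag) == len(diag)      # criterion (ii): S = dim V

assert signature([1, 1, -5]) == 1            # Example 12.4 above
assert is_positive_definite([2, 3, 1])
assert is_nonneg_semidefinite([1, 4, 0]) and not is_positive_definite([1, 4, 0])
```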
Example 12.6: Let f be the dot product on Rⁿ; that is,

    f(u, v) = u·v = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ

where u = (aᵢ) and v = (bᵢ). Note that f is symmetric since

    f(u, v) = u·v = v·u = f(v, u)

Furthermore, f is positive definite because

    f(u, u) = a₁² + a₂² + ⋯ + aₙ² > 0

when u ≠ 0.
In the next chapter we will see how a real quadratic form q transforms when the transition
matrix P is "orthogonal". If no condition is placed on P, then q can be represented
in diagonal form with only 1's and −1's as nonzero coefficients. Specifically,

Corollary 12.6: Any real quadratic form q has a unique representation in the form

    q(x₁, ..., xₙ) = x₁² + ⋯ + xₛ² − xₛ₊₁² − ⋯ − xₛ₊ₜ²

The above result for real quadratic forms is sometimes referred to as the Law of Inertia
or Sylvester's Theorem.
HERMITIAN FORMS

Let V be a vector space of finite dimension over the complex field C. Let f : V × V → C
be such that

(i)  f(au₁ + bu₂, v) = af(u₁, v) + bf(u₂, v)
(ii) f(u, v) = conj f(v, u)

where a, b ∈ C and uᵢ, v ∈ V. Then f is called a Hermitian form on V. (As usual, k̄ or conj k
denotes the complex conjugate of k ∈ C.) By (i) and (ii),

    f(u, av₁ + bv₂) = conj f(av₁ + bv₂, u) = conj(a f(v₁, u) + b f(v₂, u))
                    = ā conj f(v₁, u) + b̄ conj f(v₂, u) = ā f(u, v₁) + b̄ f(u, v₂)

That is,

(iii) f(u, av₁ + bv₂) = ā f(u, v₁) + b̄ f(u, v₂)

As before, we express condition (i) by saying f is linear in the first variable. On the other
hand, we express condition (iii) by saying f is conjugate linear in the second variable. Note
that, by (ii), f(v, v) = conj f(v, v) and so f(v, v) is real for every v ∈ V.
Example 12.7: Let A = (aᵢⱼ) be an n×n matrix over C. We write Ā for the matrix obtained by
taking the complex conjugate of every entry of A, that is, Ā = (āᵢⱼ). We also write
A* for Āᵗ = conj(Aᵗ). The matrix A is said to be Hermitian if A* = A, i.e. if aᵢⱼ = āⱼᵢ.
If A is Hermitian, then f(X, Y) = XᵗAȲ defines a Hermitian form on Cⁿ (Problem 12.16).
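The defining property of a Hermitian form is easy to check numerically on a small Hermitian matrix (a sketch with sample data of our choosing, not from the text):

```python
# f(X, Y) = X^t A conj(Y) for a 2x2 Hermitian matrix A, i.e. A* = A.
A = [[2, 1 + 1j],
     [1 - 1j, 3]]

def f(X, Y):
    return sum(X[i]*A[i][j]*Y[j].conjugate()
               for i in range(2) for j in range(2))

X = [1 + 2j, -1j]
Y = [3, 2 - 1j]
assert f(X, Y) == f(Y, X).conjugate()   # condition (ii)
assert f(X, X).imag == 0                # f(v, v) is real
```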
The mapping q : V → R defined by q(v) = f(v, v) is called the Hermitian quadratic form
or complex quadratic form associated with the Hermitian form f. We can obtain f from
q according to the following identity, called the polar form of f:

    f(u, v) = ¼(q(u + v) − q(u − v)) + (i/4)(q(u + iv) − q(u − iv))

Now suppose {e₁, ..., eₙ} is a basis of V. The matrix H = (hᵢⱼ) where hᵢⱼ = f(eᵢ, eⱼ) is
called the matrix representation of f in the basis {eᵢ}. By (ii), f(eᵢ, eⱼ) = conj f(eⱼ, eᵢ); hence H
is Hermitian and, in particular, the diagonal entries of H are real. Thus any diagonal representation
of f contains only real entries. The next theorem is the complex analog of
Theorem 12.5 on real symmetric bilinear forms.
Theorem 12.7: Let f be a Hermitian form on V. Then there exists a basis {e₁, ..., eₙ} of
V in which f is represented by a diagonal matrix, i.e. f(eᵢ, eⱼ) = 0 for
i ≠ j. Moreover, every diagonal representation of f has the same number
P of positive entries, and the same number N of negative entries. The
difference S = P − N is called the signature of f.

Analogously, a Hermitian form f is said to be nonnegative semidefinite if

    q(v) = f(v, v) ≥ 0

for every v ∈ V, and is said to be positive definite if

    q(v) = f(v, v) > 0

for every v ≠ 0.
Example 12.8: Let f be the dot product on Cⁿ; that is,

    f(u, v) = u·v = z₁w̄₁ + z₂w̄₂ + ⋯ + zₙw̄ₙ

where u = (zᵢ) and v = (wᵢ). Then f is a Hermitian form on Cⁿ. Moreover, f is
positive definite since, for any u ≠ 0,

    f(u, u) = z₁z̄₁ + z₂z̄₂ + ⋯ + zₙz̄ₙ = |z₁|² + |z₂|² + ⋯ + |zₙ|² > 0
Solved Problems
BILINEAR FORMS
12.1. Let u = (x₁, x₂, x₃) and v = (y₁, y₂, y₃), and let

    f(u, v) = 3x₁y₁ − 2x₁y₃ + 5x₂y₁ + 7x₂y₂ − 8x₂y₃ + 4x₃y₂ − x₃y₃

Express f in matrix notation.

Let A be the 3×3 matrix whose ij-entry is the coefficient of xᵢyⱼ. Then

                                  [ 3 0 -2 ] [ y₁ ]
    f(u, v) = XᵗAY = (x₁, x₂, x₃) [ 5 7 -8 ] [ y₂ ]
                                  [ 0 4 -1 ] [ y₃ ]
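The coefficient matrix of Problem 12.1 can be verified numerically (a quick check of our own):

```python
# The matrix of coefficients of x_i y_j reproduces f(u, v) as X^t A Y.
A = [[3, 0, -2],
     [5, 7, -8],
     [0, 4, -1]]

def f(u, v):
    x1, x2, x3 = u
    y1, y2, y3 = v
    return 3*x1*y1 - 2*x1*y3 + 5*x2*y1 + 7*x2*y2 - 8*x2*y3 + 4*x3*y2 - x3*y3

def xtay(u, v):
    return sum(u[i]*A[i][j]*v[j] for i in range(3) for j in range(3))

u, v = (1, -2, 3), (4, 0, -1)
assert f(u, v) == xtay(u, v)
```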
12.2. Let A be an n×n matrix over K. Show that the following mapping f is a bilinear
form on Kⁿ: f(X, Y) = XᵗAY.

For any a, b ∈ K and any Xᵢ, Yᵢ ∈ Kⁿ,

    f(aX₁ + bX₂, Y) = (aX₁ + bX₂)ᵗAY = (aX₁ᵗ + bX₂ᵗ)AY
                    = aX₁ᵗAY + bX₂ᵗAY = af(X₁, Y) + bf(X₂, Y)

Hence f is linear in the first variable. Also,

    f(X, aY₁ + bY₂) = XᵗA(aY₁ + bY₂) = aXᵗAY₁ + bXᵗAY₂ = af(X, Y₁) + bf(X, Y₂)

Hence f is linear in the second variable, and so f is a bilinear form on Kⁿ.
12.3. Let f be the bilinear form on R² defined by

    f((x₁, x₂), (y₁, y₂)) = 2x₁y₁ − 3x₁y₂ + x₂y₂

(i) Find the matrix A of f in the basis {u₁ = (1, 0), u₂ = (1, 1)}.
(ii) Find the matrix B of f in the basis {v₁ = (2, 1), v₂ = (1, −1)}.
(iii) Find the transition matrix P from the basis {uᵢ} to the basis {vᵢ}, and verify
that B = PᵗAP.

(i) Set A = (aᵢⱼ) where aᵢⱼ = f(uᵢ, uⱼ):

    a₁₁ = f(u₁, u₁) = f((1, 0), (1, 0)) = 2 − 0 + 0 = 2
    a₁₂ = f(u₁, u₂) = f((1, 0), (1, 1)) = 2 − 3 + 0 = −1
    a₂₁ = f(u₂, u₁) = f((1, 1), (1, 0)) = 2 − 0 + 0 = 2
    a₂₂ = f(u₂, u₂) = f((1, 1), (1, 1)) = 2 − 3 + 1 = 0

Thus A = [ 2 -1 ] is the matrix of f in the basis {u₁, u₂}.
         [ 2  0 ]

(ii) Set B = (bᵢⱼ) where bᵢⱼ = f(vᵢ, vⱼ):

    b₁₁ = f(v₁, v₁) = f((2, 1), (2, 1)) = 8 − 6 + 1 = 3
    b₁₂ = f(v₁, v₂) = f((2, 1), (1, −1)) = 4 + 6 − 1 = 9
    b₂₁ = f(v₂, v₁) = f((1, −1), (2, 1)) = 4 − 3 − 1 = 0
    b₂₂ = f(v₂, v₂) = f((1, −1), (1, −1)) = 2 + 3 + 1 = 6

Thus B = [ 3 9 ] is the matrix of f in the basis {v₁, v₂}.
         [ 0 6 ]

(iii) We must write v₁ and v₂ in terms of the uᵢ:

    v₁ = (2, 1) = (1, 0) + (1, 1) = u₁ + u₂
    v₂ = (1, −1) = 2(1, 0) − (1, 1) = 2u₁ − u₂

Then P = [ 1  2 ]  and so  Pᵗ = [ 1  1 ] . Thus
         [ 1 -1 ]               [ 2 -1 ]

    PᵗAP = [ 1  1 ] [ 2 -1 ] [ 1  2 ] = [ 3 9 ] = B
           [ 2 -1 ] [ 2  0 ] [ 1 -1 ]   [ 0 6 ]
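The verification in (iii) can also be done by machine (a quick check, not from the text):

```python
# Verify B = P^t A P for the matrices of Problem 12.3.
def matmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[2, -1], [2, 0]]
B = [[3, 9], [0, 6]]
P = [[1, 2], [1, -1]]
Pt = [[1, 1], [2, -1]]

assert matmul(matmul(Pt, A), P) == B
```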
12.4. Prove Theorem 12.1: Let V be a vector space of dimension n over K. Let {φ₁, ..., φₙ}
be a basis of the dual space V*. Then {fᵢⱼ : i, j = 1, ..., n} is a basis of B(V) where
fᵢⱼ is defined by fᵢⱼ(u, v) = φᵢ(u)φⱼ(v). Thus, in particular, dim B(V) = n².

Let {e₁, ..., eₙ} be the basis of V dual to {φᵢ}. We first show that {fᵢⱼ} spans B(V). Let f ∈ B(V)
and suppose f(eᵢ, eⱼ) = aᵢⱼ. We claim that f = Σ aᵢⱼfᵢⱼ. It suffices to show that f(eₛ, eₜ) =
(Σ aᵢⱼfᵢⱼ)(eₛ, eₜ) for s, t = 1, ..., n. We have

    (Σ aᵢⱼfᵢⱼ)(eₛ, eₜ) = Σ aᵢⱼ fᵢⱼ(eₛ, eₜ) = Σ aᵢⱼ φᵢ(eₛ) φⱼ(eₜ)
                      = Σ aᵢⱼ δᵢₛ δⱼₜ = aₛₜ = f(eₛ, eₜ)

as required. Hence {fᵢⱼ} spans B(V).

It remains to show that {fᵢⱼ} is linearly independent. Suppose Σ aᵢⱼfᵢⱼ = 0. Then for
s, t = 1, ..., n,

    0 = 0(eₛ, eₜ) = (Σ aᵢⱼfᵢⱼ)(eₛ, eₜ) = aₛₜ

The last step follows as above. Thus {fᵢⱼ} is independent and hence is a basis of B(V).
12.5. Let [f] denote the matrix representation of a bilinear form f on V relative to a basis
{e₁, ..., eₙ} of V. Show that the mapping f ↦ [f] is an isomorphism of B(V) onto
the vector space of n-square matrices.

Since f is completely determined by the scalars f(eᵢ, eⱼ), the mapping f ↦ [f] is one-to-one and
onto. It suffices to show that the mapping f ↦ [f] is a homomorphism; that is, that

    [af + bg] = a[f] + b[g]     (*)

However, for i, j = 1, ..., n,

    (af + bg)(eᵢ, eⱼ) = af(eᵢ, eⱼ) + bg(eᵢ, eⱼ)

which is a restatement of (*). Thus the result is proved.
12.6. Prove Theorem 12.2: Let P be the transition matrix from one basis {eᵢ} to another
basis {eᵢ'}. If A is the matrix of f in the original basis {eᵢ}, then B = PᵗAP is the
matrix of f in the new basis {eᵢ'}.

Let u, v ∈ V. Since P is the transition matrix from {eᵢ} to {eᵢ'}, we have P[u]ₑ' = [u]ₑ and
P[v]ₑ' = [v]ₑ; hence [u]ₑᵗ = [u]ₑ'ᵗPᵗ. Thus

    f(u, v) = [u]ₑᵗA[v]ₑ = [u]ₑ'ᵗPᵗAP[v]ₑ'

Since u and v are arbitrary elements of V, PᵗAP is the matrix of f in the basis {eᵢ'}.
SYMMETRIC BILINEAR FORMS. QUADRATIC FORMS

12.7. Find the symmetric matrix which corresponds to each of the following quadratic
polynomials:

(i) q(x, y) = 4x² − 6xy − 7y²            (iii) q(x, y, z) = 3x² + 4xy − y² + 8xz − 6yz + z²
(ii) q(x, y) = xy + y²                   (iv) q(x, y, z) = x² − 2yz + xz

The symmetric matrix A = (aᵢⱼ) representing q(x₁, ..., xₙ) has the diagonal entry aᵢᵢ equal to
the coefficient of xᵢ² and has the entries aᵢⱼ and aⱼᵢ each equal to half the coefficient of xᵢxⱼ. Thus

    (i) [  4 -3 ]    (ii) [  0  ½ ]    (iii) [ 3  2  4 ]    (iv) [ 1  0  ½ ]
        [ -3 -7 ]         [  ½  1 ]          [ 2 -1 -3 ]         [ 0  0 -1 ]
                                             [ 4 -3  1 ]         [ ½ -1  0 ]
12.8. For each of the following real symmetric matrices A, find a nonsingular matrix P
such that PᵗAP is diagonal, and also find its signature:

    (i) A = [ 1  3  2 ]        (ii) A = [ 0  1  1 ]
            [ 3 10  8 ]                 [ 1 -2  2 ]
            [ 2  8  6 ]                 [ 1  2 -1 ]

(i) First form the block matrix (A, I):

    (A, I) = [ 1  3  2 | 1 0 0 ]
             [ 3 10  8 | 0 1 0 ]
             [ 2  8  6 | 0 0 1 ]

Apply the row operations R₂ → −3R₁ + R₂ and R₃ → −2R₁ + R₃ to (A, I) and then the corresponding
column operations C₂ → −3C₁ + C₂ and C₃ → −2C₁ + C₃ to A to obtain

    [ 1  3  2 |  1 0 0 ]              [ 1 0 0 |  1 0 0 ]
    [ 0  1  2 | -3 1 0 ]   and then   [ 0 1 2 | -3 1 0 ]
    [ 0  2  2 | -2 0 1 ]              [ 0 2 2 | -2 0 1 ]

Next apply the row operation R₃ → −2R₂ + R₃ and then the corresponding column operation
C₃ → −2C₂ + C₃ to obtain

    [ 1 0  0 |  1  0 0 ]              [ 1 0  0 |  1  0 0 ]
    [ 0 1  2 | -3  1 0 ]   and then   [ 0 1  0 | -3  1 0 ]
    [ 0 0 -2 |  4 -2 1 ]              [ 0 0 -2 |  4 -2 1 ]

Now A has been diagonalized. Set

    P = [ 1 -3  4 ]                    [ 1 0  0 ]
        [ 0  1 -2 ]   and then PᵗAP =  [ 0 1  0 ]
        [ 0  0  1 ]                    [ 0 0 -2 ]

The signature S of A is S = 2 − 1 = 1.
(ii) First form the block matrix (A, I):
(A,/) =
In order to bring the nonzero diagonal entry —1 into the first diagonal position, apply the row
operation Ri <> R^ and then the corresponding column operation Ci «> C3 to obtain
1
1
1
1
2
2
1
1
2
1
1
1
2
1
1\
1
2
1
1
1
2
2
1
and then
2
2
1
1
1
1
1
0/
\ 1
1
1
Apply the row operations i?2 ~* 2Bi + ^2 ^nd JB3 > iJj + R^ and then the corresponding
column operations C^ * 2Ci + C^ and C3 ^ Ci + C3 to obtain
1
2
1
1\
/I
1
2
3
1
2
and then
2
3
1
2
3
1
1
1/
\
3
1
1
1
270 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12
Apply the row operation iJg > —3^2 + 2R3 and then the corresponding column operation
C3 » 3C2 + 2C3 to obtain
/I l\ /I 1^
2 3 12 and then 2 12
\ 7 234/ \ 14 2 3 4,
/O 2\
Now A has been diagonalized. Set P = 1 3 ; then P'AP =
\l 2 4/
The signature S of 4 is the difference S = 1 — 2 = —1.
12.9. Suppose 1 + 1 ≠ 0 in K. Give a formal algorithm to diagonalize (under congruence)
a symmetric matrix A = (aᵢⱼ) over K.

Case I: a₁₁ ≠ 0. Apply the row operations Rᵢ → −aᵢ₁R₁ + a₁₁Rᵢ, i = 2, ..., n, and then the
corresponding column operations Cᵢ → −a₁ᵢC₁ + a₁₁Cᵢ to reduce A to the form [ a₁₁ 0 ; 0 B ].

Case II: a₁₁ = 0 but aᵢᵢ ≠ 0, for some i > 1. Apply the row operation R₁ ↔ Rᵢ and then the
corresponding column operation C₁ ↔ Cᵢ to bring aᵢᵢ into the first diagonal position. This reduces
the matrix to Case I.

Case III: All diagonal entries aᵢᵢ = 0. Choose i, j such that aᵢⱼ ≠ 0, and apply the row operation
Rᵢ → Rⱼ + Rᵢ and the corresponding column operation Cᵢ → Cⱼ + Cᵢ to bring 2aᵢⱼ ≠ 0 into the
ith diagonal position. This reduces the matrix to Case II.

In each of the cases, we can finally reduce A to the form [ a₁₁ 0 ; 0 B ] where B is a symmetric
matrix of order less than that of A. By induction we can finally bring A into diagonal form.

Remark: The hypothesis that 1 + 1 ≠ 0 in K is used in Case III, where we state that 2aᵢⱼ ≠ 0.
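The three cases translate directly into code. The following Python sketch (our implementation, using exact rational arithmetic so the field is Q) returns P and the diagonal matrix D with PᵗAP = D:

```python
from fractions import Fraction

def congruence_diagonalize(A):
    """Diagonalize a symmetric matrix under congruence, following the
    three cases of Problem 12.9. Returns (P, D) with P^t A P = D diagonal."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    P = [[Fraction(i == j) for j in range(n)] for i in range(n)]

    def row_col_op(i, j, s, t):      # R_i <- s*R_j + t*R_i, then C_i <- s*C_j + t*C_i
        A[i] = [s*a + t*b for a, b in zip(A[j], A[i])]
        for r in A:
            r[i] = s*r[j] + t*r[i]
        for r in P:                  # P accumulates the column operations
            r[i] = s*r[j] + t*r[i]

    def swap(i, j):                  # R_i <-> R_j, then C_i <-> C_j
        A[i], A[j] = A[j], A[i]
        for r in A:
            r[i], r[j] = r[j], r[i]
        for r in P:
            r[i], r[j] = r[j], r[i]

    for k in range(n):
        if A[k][k] == 0:
            piv = next((i for i in range(k+1, n) if A[i][i] != 0), None)
            if piv is not None:      # Case II: swap a nonzero diagonal entry in
                swap(k, piv)
            else:                    # Case III: all remaining diagonal entries are 0
                pair = next(((i, j) for i in range(k, n)
                             for j in range(k, n) if A[i][j] != 0), None)
                if pair is None:
                    break            # remaining block is zero; already diagonal
                i, j = pair
                row_col_op(i, j, 1, 1)   # brings 2*a_ij into the (i, i) position
                if i != k:
                    swap(k, i)
        for i in range(k+1, n):      # Case I: clear row and column k
            if A[i][k] != 0:
                row_col_op(i, k, -A[i][k], A[k][k])
    return P, A

# Reproduces Example 12.4:
P, D = congruence_diagonalize([[1, 2, -3], [2, 5, -4], [-3, -4, 8]])
```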
12.10. Let q be the quadratic form associated with the symmetric bilinear form f. Verify
the following polar form of f: f(u, v) = ½(q(u + v) − q(u) − q(v)). (Assume that
1 + 1 ≠ 0.)

    q(u + v) − q(u) − q(v) = f(u + v, u + v) − f(u, u) − f(v, v)
                           = f(u, u) + f(u, v) + f(v, u) + f(v, v) − f(u, u) − f(v, v)
                           = 2f(u, v)

If 1 + 1 ≠ 0, we can divide by 2 to obtain the required identity.
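The polar identity is easy to confirm numerically on a sample symmetric form (our example, not the text's):

```python
# Check 2 f(u, v) = q(u+v) - q(u) - q(v) for a symmetric bilinear form on R^2.
def f(u, v):
    return u[0]*v[0] + 2*(u[0]*v[1] + u[1]*v[0]) - u[1]*v[1]

def q(v):
    return f(v, v)

for u in [(1, 2), (-3, 5)]:
    for v in [(0, 1), (4, -2)]:
        w = (u[0] + v[0], u[1] + v[1])
        assert 2*f(u, v) == q(w) - q(u) - q(v)
```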
12.11. Prove Theorem 12.4: Let f be a symmetric bilinear form on V over K (in which
1 + 1 ≠ 0). Then V has a basis {v₁, ..., vₙ} in which f is represented by a diagonal
matrix, i.e. f(vᵢ, vⱼ) = 0 for i ≠ j.

Method 1.
If f = 0 or if dim V = 1, then the theorem clearly holds. Hence we can suppose f ≠ 0 and
dim V = n > 1. If q(v) = f(v, v) = 0 for every v ∈ V, then the polar form of f (see Problem 12.10)
implies that f = 0. Hence we can assume there is a vector v₁ ∈ V such that f(v₁, v₁) ≠ 0. Let
U be the subspace spanned by v₁ and let W consist of those vectors v ∈ V for which f(v₁, v) = 0.
We claim that V = U ⊕ W.

(i) Proof that U ∩ W = {0}: Suppose u ∈ U ∩ W. Since u ∈ U, u = kv₁ for some scalar k ∈ K.
Since u ∈ W, 0 = f(u, u) = f(kv₁, kv₁) = k² f(v₁, v₁). But f(v₁, v₁) ≠ 0; hence k = 0 and therefore
u = kv₁ = 0. Thus U ∩ W = {0}.

(ii) Proof that V = U + W: Let v ∈ V. Set

    w = v − (f(v₁, v)/f(v₁, v₁)) v₁     (1)

Then

    f(v₁, w) = f(v₁, v) − (f(v₁, v)/f(v₁, v₁)) f(v₁, v₁) = 0

Thus w ∈ W. By (1), v is the sum of an element of U and an element of W. Thus V = U + W.
By (i) and (ii), V = U ⊕ W.

Now f restricted to W is a symmetric bilinear form on W. But dim W = n − 1; hence by
induction there is a basis {v₂, ..., vₙ} of W such that f(vᵢ, vⱼ) = 0 for i ≠ j and 2 ≤ i, j ≤ n. But
by the very definition of W, f(v₁, vⱼ) = 0 for j = 2, ..., n. Therefore the basis {v₁, ..., vₙ} of V
has the required property that f(vᵢ, vⱼ) = 0 for i ≠ j.

Method 2.
The algorithm in Problem 12.9 shows that every symmetric matrix over K is congruent to a
diagonal matrix. This is equivalent to the statement that f has a diagonal matrix representation.
12.12. Let A = diag(a₁, ..., aₙ), a diagonal matrix over K. Show that:

(i) for any nonzero scalars k₁, ..., kₙ ∈ K, A is congruent to a diagonal matrix
with diagonal entries aᵢkᵢ²;
(ii) if K is the complex field C, then A is congruent to a diagonal matrix with only
1's and 0's as diagonal entries;
(iii) if K is the real field R, then A is congruent to a diagonal matrix with only
1's, −1's and 0's as diagonal entries.

(i) Let P be the diagonal matrix with diagonal entries kᵢ. Then

    PᵗAP = diag(k₁, ..., kₙ) diag(a₁, ..., aₙ) diag(k₁, ..., kₙ) = diag(a₁k₁², ..., aₙkₙ²)

(ii) Let P be the diagonal matrix with diagonal entries bᵢ = 1/√aᵢ if aᵢ ≠ 0, and bᵢ = 1 if aᵢ = 0.
Then PᵗAP has the required form.

(iii) Let P be the diagonal matrix with diagonal entries bᵢ = 1/√|aᵢ| if aᵢ ≠ 0, and bᵢ = 1 if aᵢ = 0.
Then PᵗAP has the required form.

Remark. We emphasize that (ii) is no longer true if congruence is replaced by Hermitian
congruence (see Problems 12.40 and 12.41).
12.13. Prove Theorem 12.5: Let f be a symmetric bilinear form on V over R. Then there
is a basis of V in which f is represented by a diagonal matrix, and every other diagonal
representation of f has the same number of positive entries and the same number of
negative entries.

By Theorem 12.4, there is a basis {u₁, ..., uₙ} of V in which f is represented by a diagonal
matrix, say, with P positive and N negative entries. Now suppose {w₁, ..., wₙ} is another basis of
V in which f is represented by a diagonal matrix, say, with P' positive and N' negative entries. We
can assume without loss in generality that the positive entries in each matrix appear first. Since
rank(f) = P + N = P' + N', it suffices to prove that P = P'.

Let U be the linear span of u₁, ..., u_P and let W be the linear span of w_{P'+1}, ..., wₙ. Then
f(v, v) > 0 for every nonzero v ∈ U, and f(v, v) ≤ 0 for every nonzero v ∈ W. Hence
U ∩ W = {0}. Note that dim U = P and dim W = n − P'. Thus

    dim(U + W) = dim U + dim W − dim(U ∩ W) = P + (n − P') − 0 = P − P' + n

But dim(U + W) ≤ dim V = n; hence P − P' + n ≤ n or P ≤ P'. Similarly, P' ≤ P and therefore
P = P', as required.

Remark. The above theorem and proof depend only on the concept of positivity. Thus the
theorem is true for any subfield K of the real field R.
12.14. An n×n real symmetric matrix A is said to be positive definite if XᵗAX > 0 for
every nonzero (column) vector X ∈ Rⁿ, i.e. if A is positive definite viewed as a
bilinear form. Let B be any real nonsingular matrix. Show that (i) BᵗB is symmetric
and (ii) BᵗB is positive definite.

(i) (BᵗB)ᵗ = BᵗBᵗᵗ = BᵗB; hence BᵗB is symmetric.

(ii) Since B is nonsingular, BX ≠ 0 for any nonzero X ∈ Rⁿ. Hence the dot product of BX with
itself, BX·BX = (BX)ᵗ(BX), is positive. Thus Xᵗ(BᵗB)X = (XᵗBᵗ)(BX) = (BX)ᵗ(BX) > 0 as
required.
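Both claims can be spot-checked on a sample nonsingular B (our data, not the text's):

```python
# B^t B is symmetric, and X^t (B^t B) X > 0 for the nonzero X tried.
def transpose(M):
    return [list(r) for r in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

B = [[1, 2], [3, 5]]             # det = -1, so B is nonsingular
S = matmul(transpose(B), B)
assert S == transpose(S)         # (i) symmetric

def quad(S, X):
    return sum(X[i]*S[i][j]*X[j] for i in range(2) for j in range(2))

for X in [(1, 0), (0, 1), (2, -3), (-1, 4)]:
    assert quad(S, X) > 0        # (ii) positive definite on these samples
```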
HERMITIAN FORMS

12.15. Determine which of the following matrices are Hermitian:

    (i) [   2   2i  4+i ]     (ii) [   3  2+i ]     (iii) [  4 -3 ]
        [ -2i    6    i ]          [ 2+i    5 ]           [ -3  5 ]
        [ 4-i   -i    3 ]

A matrix A = (aᵢⱼ) is Hermitian iff A = A*, i.e. iff aᵢⱼ = āⱼᵢ.

(i) The matrix is Hermitian, since it is equal to its conjugate transpose.
(ii) The matrix is not Hermitian, even though it is symmetric.
(iii) The matrix is Hermitian. In fact, a real matrix is Hermitian if and only if it is symmetric.
12.16. Let A be a Hermitian matrix. Show that f is a Hermitian form on Cⁿ where f is
defined by f(X, Y) = XᵗAȲ.

For all a, b ∈ C and all X₁, X₂, Y ∈ Cⁿ,

    f(aX₁ + bX₂, Y) = (aX₁ + bX₂)ᵗAȲ = (aX₁ᵗ + bX₂ᵗ)AȲ
                    = aX₁ᵗAȲ + bX₂ᵗAȲ = af(X₁, Y) + bf(X₂, Y)

Hence f is linear in the first variable. Also,

    conj f(X, Y) = conj(XᵗAȲ) = X̄ᵗĀY = (X̄ᵗĀY)ᵗ = YᵗĀᵗX̄ = YᵗA*X̄ = YᵗAX̄ = f(Y, X)

Hence f is a Hermitian form on Cⁿ. (Remark. We use the fact that XᵗAȲ is a scalar and so it is
equal to its transpose.)
12.17. Let f be a Hermitian form on V. Let H be the matrix of f in a basis {e₁, ..., eₙ} of
V. Show that:

(i) f(u, v) = [u]ₑᵗ H conj([v]ₑ) for all u, v ∈ V;
(ii) if P is the transition matrix from {eᵢ} to a new basis {eᵢ'} of V, then B = PᵗHP̄
(or: B = Q*HQ where Q = P̄) is the matrix of f in the new basis {eᵢ'}.

Note that (ii) is the complex analog of Theorem 12.2.

(i) Let u, v ∈ V and suppose

    u = a₁e₁ + a₂e₂ + ⋯ + aₙeₙ   and   v = b₁e₁ + b₂e₂ + ⋯ + bₙeₙ

Then

    f(u, v) = f(a₁e₁ + ⋯ + aₙeₙ, b₁e₁ + ⋯ + bₙeₙ) = Σᵢ,ⱼ aᵢb̄ⱼ f(eᵢ, eⱼ) = [u]ₑᵗ H conj([v]ₑ)

as required.

(ii) Since P is the transition matrix from {eᵢ} to {eᵢ'}, we have

    P[u]ₑ' = [u]ₑ,   P[v]ₑ' = [v]ₑ   and so   [u]ₑᵗ = [u]ₑ'ᵗPᵗ,   conj([v]ₑ) = P̄ conj([v]ₑ')

Thus by (i), f(u, v) = [u]ₑᵗ H conj([v]ₑ) = [u]ₑ'ᵗ PᵗHP̄ conj([v]ₑ'). But u and v are arbitrary
elements of V; hence PᵗHP̄ is the matrix of f in the basis {eᵢ'}.
12.18. Let H = [   1   1+i    2i ]
               [ 1-i     4  2-3i ] ,  a Hermitian matrix. Find a nonsingular matrix
               [ -2i  2+3i     7 ]
P such that PᵗHP̄ is diagonal.

First form the block matrix (H, I):

    (H, I) = [   1   1+i    2i | 1 0 0 ]
             [ 1-i     4  2-3i | 0 1 0 ]
             [ -2i  2+3i     7 | 0 0 1 ]

Apply the row operations R₂ → (−1 + i)R₁ + R₂ and R₃ → 2iR₁ + R₃ to (H, I), and then the
corresponding "Hermitian column operations" (see Problem 12.42) C₂ → (−1 − i)C₁ + C₂ and
C₃ → −2iC₁ + C₃ to H to obtain

    [ 1  1+i   2i |    1 0 0 ]              [ 1  0   0 |    1 0 0 ]
    [ 0    2  -5i | -1+i 1 0 ]   and then   [ 0  2 -5i | -1+i 1 0 ]
    [ 0   5i    3 |   2i 0 1 ]              [ 0 5i   3 |   2i 0 1 ]

Next apply the row operation R₃ → −5iR₂ + 2R₃ and the corresponding Hermitian column operation
C₃ → 5iC₂ + 2C₃ to obtain

    [ 1 0   0 |    1   0 0 ]              [ 1 0    0 |    1   0 0 ]
    [ 0 2 -5i | -1+i   1 0 ]   and then   [ 0 2    0 | -1+i   1 0 ]
    [ 0 0 -19 | 5+9i -5i 2 ]              [ 0 0  -38 | 5+9i -5i 2 ]

Now H has been diagonalized. Set

    P = [ 1  -1+i  5+9i ]                    [ 1 0   0 ]
        [ 0     1   -5i ]   and then PᵗHP̄ =  [ 0 2   0 ]
        [ 0     0     2 ]                    [ 0 0 -38 ]

Observe that the signature S of H is S = 2 − 1 = 1.
MISCELLANEOUS PROBLEMS

12.19. Show that any bilinear form f on V is the sum of a symmetric bilinear form and a
skew symmetric bilinear form.

Set g(u, v) = ½[f(u, v) + f(v, u)] and h(u, v) = ½[f(u, v) − f(v, u)]. Then g is symmetric because

    g(u, v) = ½[f(u, v) + f(v, u)] = ½[f(v, u) + f(u, v)] = g(v, u)

and h is skew symmetric because

    h(u, v) = ½[f(u, v) − f(v, u)] = −½[f(v, u) − f(u, v)] = −h(v, u)

Furthermore, f = g + h.
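The decomposition of Problem 12.19 can be checked on a sample non-symmetric bilinear form (our data):

```python
# f = g + h with g symmetric and h skew symmetric.
from fractions import Fraction

def f(u, v):                      # a sample non-symmetric bilinear form on R^2
    return 2*u[0]*v[0] - 3*u[0]*v[1] + u[1]*v[1]

def g(u, v):
    return Fraction(f(u, v) + f(v, u), 2)

def h(u, v):
    return Fraction(f(u, v) - f(v, u), 2)

for u in [(1, 2), (-3, 0)]:
    for v in [(0, 1), (4, -1)]:
        assert g(u, v) == g(v, u)          # symmetric
        assert h(u, v) == -h(v, u)         # skew symmetric
        assert g(u, v) + h(u, v) == f(u, v)
```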
12.20. Prove Theorem 12.3: Let f be an alternating bilinear form on V. Then there exists
a basis of V in which f is represented by a block diagonal matrix of the form
diag(M, ..., M, 0, ..., 0) where

    M = [  0 1 ]
        [ -1 0 ]

Moreover, the number of blocks M is uniquely determined by f (because it is equal
to ½ rank(f)).

If f = 0, then the theorem is obviously true. Also, if dim V = 1, then f(k₁u, k₂u) =
k₁k₂ f(u, u) = 0 and so f = 0. Accordingly we can assume that dim V > 1 and f ≠ 0.

Since f ≠ 0, there exist (nonzero) u₁, u₂ ∈ V such that f(u₁, u₂) ≠ 0. In fact, multiplying
u₁ by an appropriate factor, we can assume that f(u₁, u₂) = 1 and so f(u₂, u₁) = −1. Now u₁ and
u₂ are linearly independent; because if, say, u₂ = ku₁, then f(u₁, u₂) = f(u₁, ku₁) = k f(u₁, u₁) = 0.
Let U be the subspace spanned by u₁ and u₂, i.e. U = L(u₁, u₂). Note:

(i) the matrix representation of the restriction of f to U in the basis {u₁, u₂} is [ 0 1; -1 0 ];
(ii) if u ∈ U, say u = au₁ + bu₂, then

    f(u, u₁) = f(au₁ + bu₂, u₁) = −b
    f(u, u₂) = f(au₁ + bu₂, u₂) = a

Let W consist of those vectors w ∈ V such that f(w, u₁) = 0 and f(w, u₂) = 0. Equivalently,

    W = {w ∈ V : f(w, u) = 0 for every u ∈ U}

We claim that V = U ⊕ W. It is clear that U ∩ W = {0}, and so it remains to show that
V = U + W. Let v ∈ V. Set

    u = f(v, u₂)u₁ − f(v, u₁)u₂   and   w = v − u     (1)

Since u is a linear combination of u₁ and u₂, u ∈ U. We show that w ∈ W. By (i) and (ii),
f(u, u₁) = f(v, u₁); hence

    f(w, u₁) = f(v − u, u₁) = f(v, u₁) − f(u, u₁) = 0

Similarly, f(u, u₂) = f(v, u₂) and so

    f(w, u₂) = f(v − u, u₂) = f(v, u₂) − f(u, u₂) = 0

Then w ∈ W and so, by (1), v = u + w where u ∈ U and w ∈ W. This shows that V = U + W;
and therefore V = U ⊕ W.

Now the restriction of f to W is an alternating bilinear form on W. By induction, there exists
a basis u₃, ..., uₙ of W in which the matrix representing f restricted to W has the desired form.
Thus u₁, u₂, u₃, ..., uₙ is a basis of V in which the matrix representing f has the desired form.
Supplementary Problems
BILINEAR FORMS
12.21. Let u = (x₁, x₂) and v = (y₁, y₂). Determine which of the following are bilinear forms on R²:

(i) f(u, v) = 2x₁y₂ − 3x₂y₁      (iv) f(u, v) = x₁x₂ + y₁y₂
(ii) f(u, v) = x₁ + y₂           (v) f(u, v) = 1
(iii) f(u, v) = 3x₂y₂            (vi) f(u, v) = 0
12.22. Let f be the bilinear form on R² defined by

    f((x₁, x₂), (y₁, y₂)) = 3x₁y₁ − 2x₁y₂ + 4x₂y₁ − x₂y₂

(i) Find the matrix A of f in the basis {u₁ = (1, 1), u₂ = (1, 2)}.
(ii) Find the matrix B of f in the basis {v₁ = (1, −1), v₂ = (3, 1)}.
(iii) Find the transition matrix P from {uᵢ} to {vᵢ} and verify that B = PᵗAP.
12.23. Let V be the vector space of 2×2 matrices over R. Let M be a fixed matrix in V, and let
f(A, B) = tr(AᵗMB) where A, B ∈ V and "tr" denotes trace. (i) Show that f is a bilinear form on V.
(ii) Find the matrix of f in the basis

    { [ 1 0 ]   [ 0 1 ]   [ 0 0 ]   [ 0 0 ] }
    { [ 0 0 ] , [ 0 0 ] , [ 1 0 ] , [ 0 1 ] }
12.24. Let B(V) be the set of bilinear forms on V over K. Prove:

(i) if f, g ∈ B(V), then f + g and kf, for k ∈ K, also belong to B(V), and so B(V) is a subspace
of the vector space of functions from V × V into K;
(ii) if φ and σ are linear functionals on V, then f(u, v) = φ(u)σ(v) belongs to B(V).
12.25. Let f be a bilinear form on V. For any subset S of V, we write
S⊥ = {v ∈ V : f(u, v) = 0 for every u ∈ S},  Sᵀ = {v ∈ V : f(v, u) = 0 for every u ∈ S}
Show that: (i) S⊥ and Sᵀ are subspaces of V; (ii) S1 ⊆ S2 implies S2⊥ ⊆ S1⊥ and S2ᵀ ⊆ S1ᵀ;
(iii) {0}⊥ = {0}ᵀ = V.
12.26. Prove: If f is a bilinear form on V, then rank(f) = dim V − dim V⊥ = dim V − dim Vᵀ and
hence dim V⊥ = dim Vᵀ.
12.27. Let f be a bilinear form on V. For each u ∈ V, let û : V → K and ũ : V → K be defined by
û(x) = f(x, u) and ũ(x) = f(u, x). Prove:
(i) û and ũ are each linear, i.e. û, ũ ∈ V*;
(ii) u ↦ û and u ↦ ũ are each linear mappings from V into V*;
(iii) rank(f) = rank(u ↦ û) = rank(u ↦ ũ).
12.28. Show that congruence of matrices is an equivalence relation, i.e. (i) A is congruent to A; (ii) if A
is congruent to B, then B is congruent to A; (iii) if A is congruent to B and B is congruent to C,
then A is congruent to C.
SYMMETRIC BILINEAR FORMS. QUADRATIC FORMS
12.29. Find the symmetric matrix belonging to each of the following quadratic polynomials:
(i) q(x, y, z) = 2x² − 8xy + y² − 16xz + 14yz + 5z²
(ii) q(x, y, z) = x² − xz + y²
(iii) q(x, y, z) = xy + y² + 4xz + z²
(iv) q(x, y, z) = xy + yz.
12.30. For each of the following symmetric matrices A ((i) of order 2, (ii) of order 3, (iii) of order 4),
find a nonsingular matrix P such that PᵗAP is diagonal. In each case find the rank and signature.
12.31. Let q be the quadratic form associated with a symmetric bilinear form f. Verify the following
alternate polar form of f:  f(u, v) = ¼[q(u + v) − q(u − v)].
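The polar-form identity of Problem 12.31 can be spot-checked numerically. The sketch below uses an arbitrarily chosen symmetric matrix and test vectors (they are illustrative, not taken from the text):

```python
import numpy as np

A = np.array([[4., 1.], [1., 3.]])    # an arbitrary symmetric matrix (illustrative)
f = lambda u, v: u @ A @ v            # symmetric bilinear form f(u, v) = u^t A v
q = lambda u: f(u, u)                 # associated quadratic form

u = np.array([1., 2.])
v = np.array([3., -1.])
polar = 0.25 * (q(u + v) - q(u - v))  # alternate polar form of Problem 12.31
```

Since f is symmetric, q(u + v) − q(u − v) expands to 4f(u, v), which is what the check confirms.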
12.32. Let S(V) be the set of symmetric bilinear forms on V. Show that:
(i) S(V) is a subspace of B(V); (ii) if dim V = n, then dim S(V) = ½n(n + 1).
12.33. Let f be the symmetric bilinear form associated with the real quadratic form q(x, y) =
ax² + bxy + cy². Show that:
(i) f is nondegenerate if and only if b² − 4ac ≠ 0;
(ii) f is positive definite if and only if a > 0 and b² − 4ac < 0.
12.34. Suppose A is a real symmetric positive definite matrix. Show that there exists a nonsingular matrix
P such that A = PᵗP.
12.35. Consider a real quadratic polynomial q(x1, ..., xn) = Σ aij xi xj (sum over i, j = 1, ..., n),
where aij = aji.
(i) If a11 ≠ 0, show that the substitution
x1 = y1 − (1/a11)(a12y2 + ⋯ + a1nyn),  x2 = y2,  ...,  xn = yn
yields the equation q(x1, ..., xn) = a11y1² + q′(y2, ..., yn), where q′ is also a quadratic
polynomial.
(ii) If a11 = 0 but, say, a12 ≠ 0, show that the substitution
x1 = y1 + y2,  x2 = y1 − y2,  x3 = y3,  ...,  xn = yn
yields the equation q(x1, ..., xn) = Σ bij yi yj, where b11 ≠ 0, i.e. reduces this case to case (i).
This method of diagonalizing q is known as "completing the square".
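The completing-the-square recipe of Problem 12.35 can be organized as simultaneous row and column operations on the symmetric matrix of q. The sketch below assumes a real symmetric matrix for which the zero-pivot fix of case (ii) suffices; degenerate corner cases are not handled:

```python
import numpy as np

def congruence_diagonalize(A):
    """Return P with P^t A P diagonal, built by completing the square."""
    A = A.astype(float).copy()
    n = A.shape[0]
    P = np.eye(n)
    for i in range(n):
        if A[i, i] == 0:
            # case (ii) of Problem 12.35: create a nonzero diagonal entry
            for j in range(i + 1, n):
                if A[i, j] != 0:
                    A[:, i] += A[:, j]
                    A[i, :] += A[j, :]
                    P[:, i] += P[:, j]
                    break
        if A[i, i] == 0:
            continue                  # the row is already clear
        for j in range(i + 1, n):
            c = A[i, j] / A[i, i]
            A[:, j] -= c * A[:, i]    # column operation ...
            A[j, :] -= c * A[i, :]    # ... matched by the same row operation
            P[:, j] -= c * P[:, i]
    return P

A = np.array([[2., -4., -8.],
              [-4., 1., 7.],
              [-8., 7., 5.]])        # the symmetric matrix from Problem 12.29(i)
P = congruence_diagonalize(A)
D = P.T @ A @ P                      # diagonal
```

Each pair of matched row/column operations multiplies A by Eᵗ on the left and E on the right, so the accumulated P satisfies PᵗAP = D exactly as in the congruence diagonalization of this chapter.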
12.36. Use steps of the type in the preceding problem to reduce each quadratic polynomial in Problem
12.29 to diagonal form. Find the rank and signature in each case.
HERMITIAN FORMS
12.37. For any complex matrices A, B and any k ∈ C, show that (with conj denoting complex conjugation):
(i) conj(A + B) = conj(A) + conj(B), (ii) conj(kA) = conj(k) conj(A), (iii) conj(AB) = conj(A) conj(B), (iv) conj(Aᵗ) = (conj A)ᵗ.
12.38. For each of the following Hermitian matrices H, find a nonsingular matrix P such that P*HP is
diagonal. Find the rank and signature in each case.
12.39. Let A be any complex nonsingular matrix. Show that H = A*A is Hermitian and positive
definite.
12.40. We say that B is Hermitian congruent to A if there exists a nonsingular matrix Q such that
B = Q*AQ. Show that Hermitian congruence is an equivalence relation.
12.41. Prove Theorem 12.7: Let f be a Hermitian form on V. Then there exists a basis {e1, ..., en} of V
in which f is represented by a diagonal matrix, i.e. f(ei, ej) = 0 for i ≠ j. Moreover, every diagonal
representation of f has the same number P of positive entries and the same number N of negative
entries. (Note that the second part of the theorem does not hold for complex symmetric bilinear
forms, as seen by Problem 12.12(ii). However, the proof of Theorem 12.5 in Problem 12.13 does
carry over to the Hermitian case.)
MISCELLANEOUS PROBLEMS
12.42. Consider the following elementary row operations:
[a1] Ri ↔ Rj,  [a2] Ri → kRi, k ≠ 0,  [a3] Ri → kRj + Ri
The corresponding elementary column operations are, respectively,
[b1] Ci ↔ Cj,  [b2] Ci → kCi, k ≠ 0,  [b3] Ci → kCj + Ci
If K is the complex field C, then the corresponding Hermitian column operations are, respectively,
[c1] Ci ↔ Cj,  [c2] Ci → k̄Ci, k ≠ 0,  [c3] Ci → k̄Cj + Ci
(i) Show that the elementary matrix corresponding to [bi] is the transpose of the elementary
matrix corresponding to [ai].
(ii) Show that the elementary matrix corresponding to [ci] is the conjugate transpose of the
elementary matrix corresponding to [ai].
12.43. Let V and W be vector spaces over K. A mapping f : V × W → K is called a bilinear form on V
and W if:
(i) f(av1 + bv2, w) = af(v1, w) + bf(v2, w)
(ii) f(v, aw1 + bw2) = af(v, w1) + bf(v, w2)
for every a, b ∈ K, vi ∈ V, wj ∈ W. Prove the following:
(i) The set B(V, W) of bilinear forms on V and W is a subspace of the vector space of functions
from V × W into K.
(ii) If {φ1, ..., φm} is a basis of V* and {σ1, ..., σn} is a basis of W*, then {fij : i = 1, ..., m,
j = 1, ..., n} is a basis of B(V, W) where fij is defined by fij(v, w) = φi(v) σj(w). Thus
dim B(V, W) = dim V · dim W.
(Remark. Observe that if V = W, then we obtain the space B(V) investigated in this chapter.)
12.44. Let V be a vector space over K. A mapping f : V × V × ⋯ × V → K (m times) is called a multilinear (or:
m-linear) form on V if f is linear in each variable, i.e. for i = 1, ..., m,
f(..., au + bv, ...) = af(..., u, ...) + bf(..., v, ...)
where the indicated position is the ith component, and other components are held fixed. An m-linear form f is
said to be alternating if
f(v1, ..., vm) = 0 whenever vi = vk, i ≠ k
Prove:
(i) The set Bm(V) of m-linear forms on V is a subspace of the vector space of functions from
V × V × ⋯ × V into K.
(ii) The set Am(V) of alternating m-linear forms on V is a subspace of Bm(V).
Remark 1. If m = 2, then we obtain the space B(V) investigated in this chapter.
Remark 2. If V = K^m, then the determinant function is a particular alternating m-linear form on V.
Answers to Supplementary Problems

12.21. (i) Yes (ii) No (iii) Yes (iv) No (v) No (vi) Yes

12.22. (i) A = ( 4  1 )   (ii) B = (  0  −4 )   (iii) P = (  3   5 )
               ( 7  3 )            ( 20  32 )             ( −2  −2 )

12.29. (i) (  2  −4  −8 )   (ii) (  1   0  −½ )   (iii) ( 0  ½  2 )   (iv) ( 0  ½  0 )
           ( −4   1   7 )        (  0   1   0 )         ( ½  1  0 )        ( ½  0  ½ )
           ( −8   7   5 )        ( −½   0   0 )         ( 2  0  1 )        ( 0  ½  0 )

12.30. (i) S = 0, (ii) S = 1, (iii) S = 2.
chapter 13
Inner Product Spaces
INTRODUCTION
The definition of a vector space V involves an arbitrary field K. In this chapter we
restrict K to be either the real field R or the complex field C. In the first case we call V a
real vector space, and in the second case a complex vector space.
Recall that the concepts of "length" and "orthogonality" did not appear in the investigation
of arbitrary vector spaces (although they did appear in Chapter 1 on the spaces Rⁿ and
Cⁿ). In this chapter we place an additional structure on a vector space V to obtain an
inner product space, and in this context these concepts are defined.
We emphasize that V shall denote a vector space of finite dimension unless otherwise
stated or implied. In fact, many of the theorems in this chapter are not valid for spaces of
infinite dimension. This is illustrated by some of the examples and problems.
INNER PRODUCT SPACES
We begin with a definition.
Definition: Let V be a (real or complex) vector space over K. Suppose to each pair of
vectors u, v ∈ V there is assigned a scalar ⟨u, v⟩ ∈ K. This mapping is called
an inner product in V if it satisfies the following axioms:
[I1] ⟨au1 + bu2, v⟩ = a⟨u1, v⟩ + b⟨u2, v⟩
[I2] ⟨u, v⟩ = ⟨v, u⟩‾ (the bar denoting complex conjugation)
[I3] ⟨u, u⟩ ≥ 0; and ⟨u, u⟩ = 0 if and only if u = 0.
The vector space V with an inner product is called an inner product space.
Observe that ⟨u, u⟩ is always real by [I2], and so the inequality relation in [I3] makes
sense. We also use the notation
‖u‖ = √⟨u, u⟩
This nonnegative real number ‖u‖ is called the norm or length of u. Also, using [I1] and
[I2] we obtain (Problem 13.1) the relation
⟨u, av1 + bv2⟩ = ā⟨u, v1⟩ + b̄⟨u, v2⟩
If the base field K is real, the conjugate signs appearing above and in [I2] may be ignored.
In the language of the preceding chapter, an inner product is a positive definite symmetric
bilinear form if the base field is real, and is a positive definite Hermitian form if the
base field is complex.
A real inner product space is sometimes called a Euclidean space, and a complex inner
product space is sometimes called a unitary space.
280
INNER PRODUCT SPACES
[CHAP. 13
Example 13.1: Consider the dot product in Rⁿ:
u · v = a1b1 + a2b2 + ⋯ + anbn
where u = (ai) and v = (bi). This is an inner product on Rⁿ, and Rⁿ with this
inner product is usually referred to as Euclidean n-space. Although there are
many different ways to define an inner product on Rⁿ (see Problem 13.2), we shall
assume this inner product on Rⁿ unless otherwise stated or implied.
Example 13.2: Consider the dot product on Cⁿ:
u · v = z1w̄1 + z2w̄2 + ⋯ + znw̄n
where u = (zi) and v = (wi). As in the real case, this is an inner product on Cⁿ
and we shall assume this inner product on Cⁿ unless otherwise stated or implied.

Example 13.3: Let V denote the vector space of m × n matrices over R. The following is an inner
product in V:
⟨A, B⟩ = tr(BᵗA)
where tr stands for trace, the sum of the diagonal elements.
Analogously, if U denotes the vector space of m × n matrices over C, then the
following is an inner product in U:
⟨A, B⟩ = tr(B*A)
As usual, B* denotes the conjugate transpose of the matrix B.

Example 13.4: Let V be the vector space of real continuous functions on the interval a ≤ t ≤ b.
Then the following is an inner product on V:
⟨f, g⟩ = ∫ₐᵇ f(t) g(t) dt
Analogously, if U denotes the vector space of complex continuous functions on the
(real) interval a ≤ t ≤ b, then the following is an inner product on U:
⟨f, g⟩ = ∫ₐᵇ f(t) ḡ(t) dt

Example 13.5: Let V be the vector space of infinite sequences of real numbers (a1, a2, ...) satisfying
Σ aᵢ² = a1² + a2² + ⋯ < ∞
i.e. the sum converges. Addition and scalar multiplication are defined componentwise:
(a1, a2, ...) + (b1, b2, ...) = (a1 + b1, a2 + b2, ...)
k(a1, a2, ...) = (ka1, ka2, ...)
An inner product is defined in V by
⟨(a1, a2, ...), (b1, b2, ...)⟩ = a1b1 + a2b2 + ⋯
The above sum converges absolutely for any pair of points in V (Problem 13.44);
hence the inner product is well defined. This inner product space is called ℓ2-space
(or: Hilbert space).
Remark 1: If ‖v‖ = 1, i.e. if ⟨v, v⟩ = 1, then v is called a unit vector or is said to be
normalized. We note that every nonzero vector u ∈ V can be normalized by
setting v = u/‖u‖.
Remark 2: The nonnegative real number d(u, v) = ‖v − u‖ is called the distance between
u and v; this function does satisfy the axioms of a metric space (see Problem
13.51).
CAUCHY-SCHWARZ INEQUALITY
The following formula, called the Cauchy-Schwarz inequality, is used in many branches
of mathematics.
Theorem 13.1 (Cauchy-Schwarz): For any vectors u, v ∈ V,
|⟨u, v⟩| ≤ ‖u‖ ‖v‖
Next we examine this inequality in specific cases.
Example 13.6: Consider any complex numbers a1, ..., an, b1, ..., bn ∈ C. Then by the Cauchy-
Schwarz inequality,
|a1b1 + ⋯ + anbn|² ≤ (|a1|² + ⋯ + |an|²)(|b1|² + ⋯ + |bn|²)
that is, |u · v|² ≤ ‖u‖² ‖v‖²
where u = (ai) and v = (bi).

Example 13.7: Let f and g be any real continuous functions defined on the unit interval 0 ≤ t ≤ 1.
Then by the Cauchy-Schwarz inequality,
⟨f, g⟩² = (∫₀¹ f(t) g(t) dt)² ≤ ∫₀¹ f²(t) dt ∫₀¹ g²(t) dt = ‖f‖² ‖g‖²
Here V is the inner product space of Example 13.4.
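As a quick numerical sanity check of Theorem 13.1 (not one of the text's examples), the inequality can be tested on randomly generated complex vectors:

```python
import numpy as np

inner = lambda u, v: np.vdot(v, u)           # <u, v> = sum_i u_i conj(v_i)

rng = np.random.default_rng(0)
u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
v = rng.standard_normal(5) + 1j * rng.standard_normal(5)

lhs = abs(inner(u, v))                       # |<u, v>|
rhs = np.linalg.norm(u) * np.linalg.norm(v)  # ||u|| ||v||
```

Note the conjugation convention: `np.vdot` conjugates its first argument, so `np.vdot(v, u)` matches the book's inner product, which is linear in the first slot.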
ORTHOGONALITY
Let V be an inner product space. The vectors u, v ∈ V are said to be orthogonal if
⟨u, v⟩ = 0. The relation is clearly symmetric; that is, if u is orthogonal to v, then ⟨v, u⟩ =
⟨u, v⟩‾ = 0̄ = 0 and so v is orthogonal to u. We note that 0 ∈ V is orthogonal to every
v ∈ V, for
⟨0, v⟩ = ⟨0v, v⟩ = 0⟨v, v⟩ = 0
Conversely, if u is orthogonal to every v ∈ V, then ⟨u, u⟩ = 0 and hence u = 0 by [I3].
Now suppose W is any subset of V. The orthogonal complement of W, denoted by W⊥
(read "W perp"), consists of those vectors in V which are orthogonal to every w ∈ W:
W⊥ = {v ∈ V : ⟨v, w⟩ = 0 for every w ∈ W}
We show that W⊥ is a subspace of V. Clearly, 0 ∈ W⊥. Now suppose u, v ∈ W⊥. Then
for any a, b ∈ K and any w ∈ W,
⟨au + bv, w⟩ = a⟨u, w⟩ + b⟨v, w⟩ = a·0 + b·0 = 0
Thus au + bv ∈ W⊥ and therefore W⊥ is a subspace of V.
Theorem 13.2: Let W be a subspace of V. Then V is the direct sum of W and W⊥, i.e.
V = W ⊕ W⊥.
Now if W is a subspace of V, then V = W ⊕ W⊥ by the above theorem; hence there is
a unique projection E_W : V → V with image W and kernel
W⊥. That is, if v ∈ V and v = w + w′, where w ∈ W,
w′ ∈ W⊥, then E_W is defined by E_W(v) = w. This mapping
E_W is called the orthogonal projection of V onto W.
Example 13.8: Let W be the z axis in R³, i.e.
W = {(0, 0, c) : c ∈ R}
Then W⊥ is the xy plane, i.e.
W⊥ = {(a, b, 0) : a, b ∈ R}
As noted previously, R³ = W ⊕ W⊥. The
orthogonal projection E of R³ onto W is given
by E(x, y, z) = (0, 0, z).
Example 13.9: Consider a homogeneous system of linear equations over R:
a11x1 + a12x2 + ⋯ + a1nxn = 0
a21x1 + a22x2 + ⋯ + a2nxn = 0
..................
am1x1 + am2x2 + ⋯ + amnxn = 0
or in matrix notation AX = 0. Recall that the solution space W may be viewed
as the kernel of the linear operator A. We may also view W as the set of all vectors
v = (x1, ..., xn) which are orthogonal to each row of A. Thus W is the orthogonal
complement of the row space of A. Theorem 13.2 then gives another proof of the
fundamental result: dim W = n − rank(A).
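Example 13.9 can be illustrated numerically. The coefficient matrix below is an arbitrary illustrative choice (not from the text); a basis of the solution space W is read off from the SVD and checked against the rows of A:

```python
import numpy as np

A = np.array([[1., 2., 1., 0.],
              [2., 4., 0., 2.]])     # an illustrative coefficient matrix

# the rows of Vt belonging to vanishing singular values span the kernel of A
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
W = Vt[rank:]                        # rows form a basis of the solution space W

ortho = A @ W.T                      # entries are <row of A, basis vector of W>
```

Each basis vector of W is orthogonal to each row of A, and the number of basis vectors is n − rank(A), exactly as the example states.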
Remark: If V is a real inner product space, then the angle θ between nonzero vectors
u, v ∈ V is defined by
cos θ = ⟨u, v⟩ / (‖u‖ ‖v‖)
By the Cauchy-Schwarz inequality, −1 ≤ cos θ ≤ 1 and so the angle θ always
exists. Observe that u and v are orthogonal if and only if they are "perpendicular",
i.e. θ = π/2.
ORTHONORMAL SETS
A set {ui} of vectors in V is said to be orthogonal if its distinct elements are orthogonal,
i.e. if ⟨ui, uj⟩ = 0 for i ≠ j. In particular, the set {ui} is said to be orthonormal if it is
orthogonal and if each ui has length 1, that is, if
⟨ui, uj⟩ = δij = 0 for i ≠ j, and 1 for i = j
An orthonormal set can always be obtained from an orthogonal set of nonzero vectors by
normalizing each vector.
Example 13.10: Consider the usual basis of Euclidean 3-space R³:
{e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1)}
It is clear that
⟨e1, e1⟩ = ⟨e2, e2⟩ = ⟨e3, e3⟩ = 1 and ⟨ei, ej⟩ = 0 for i ≠ j
That is, {e1, e2, e3} is an orthonormal basis of R³. More generally, the usual basis
of Rⁿ or of Cⁿ is orthonormal for every n.
Example 13.11: Let V be the vector space of real continuous functions on the interval −π ≤ t ≤ π
with inner product defined by ⟨f, g⟩ = ∫ f(t) g(t) dt (taken from −π to π). The
following is a classical example of an orthogonal subset of V:
{1, cos t, cos 2t, ..., sin t, sin 2t, ...}
The above orthogonal set plays a fundamental role in the theory of Fourier series.
The following properties of an orthonormal set will be used in the next section.
Lemma 13.3: An orthonormal set {u1, ..., ur} is linearly independent and, for any v ∈ V,
the vector
w = v − ⟨v, u1⟩u1 − ⟨v, u2⟩u2 − ⋯ − ⟨v, ur⟩ur
is orthogonal to each of the ui.
GRAM-SCHMIDT ORTHOGONALIZATION PROCESS
Orthonormal bases play an important role in inner product spaces. The next theorem
shows that such a basis always exists; its proof uses the celebrated Gram-Schmidt
orthogonalization process.
Theorem 13.4: Let {v1, ..., vn} be an arbitrary basis of an inner product space V. Then
there exists an orthonormal basis {u1, ..., un} of V such that the transition
matrix from {vi} to {ui} is triangular; that is, for i = 1, ..., n,
ui = ai1v1 + ai2v2 + ⋯ + aiivi
Proof. We set u1 = v1/‖v1‖; then {u1} is orthonormal. We next set
w2 = v2 − ⟨v2, u1⟩u1 and u2 = w2/‖w2‖
By Lemma 13.3, w2 (and hence u2) is orthogonal to u1; then {u1, u2} is orthonormal. We next
set
w3 = v3 − ⟨v3, u1⟩u1 − ⟨v3, u2⟩u2 and u3 = w3/‖w3‖
Again, by Lemma 13.3, w3 (and hence u3) is orthogonal to u1 and u2; then {u1, u2, u3} is ortho-
normal. In general, after obtaining {u1, ..., ui} we set
wi+1 = vi+1 − ⟨vi+1, u1⟩u1 − ⋯ − ⟨vi+1, ui⟩ui and ui+1 = wi+1/‖wi+1‖
(Note that wi+1 ≠ 0 because vi+1 ∉ L(v1, ..., vi) = L(u1, ..., ui).) As above, {u1, ..., ui+1}
is also orthonormal. By induction we obtain an orthonormal set {u1, ..., un} which is in-
dependent and hence a basis of V. The specific construction guarantees that the transition
matrix is indeed triangular.
Example 13.12: Consider the following basis of Euclidean space R³:
{v1 = (1, 1, 1), v2 = (0, 1, 1), v3 = (0, 0, 1)}
We use the Gram-Schmidt orthogonalization process to transform {vi} into an ortho-
normal basis {ui}. First we normalize v1, i.e. we set
u1 = v1/‖v1‖ = (1, 1, 1)/√3 = (1/√3, 1/√3, 1/√3)
Next we set
w2 = v2 − ⟨v2, u1⟩u1 = (0, 1, 1) − (2/√3)(1/√3, 1/√3, 1/√3) = (−2/3, 1/3, 1/3)
and then we normalize w2, i.e. we set
u2 = w2/‖w2‖ = (−2/√6, 1/√6, 1/√6)
Finally we set
w3 = v3 − ⟨v3, u1⟩u1 − ⟨v3, u2⟩u2
   = (0, 0, 1) − (1/√3)(1/√3, 1/√3, 1/√3) − (1/√6)(−2/√6, 1/√6, 1/√6) = (0, −1/2, 1/2)
and then we normalize w3:
u3 = w3/‖w3‖ = (0, −1/√2, 1/√2)
The required orthonormal basis of R³ is
{u1 = (1/√3, 1/√3, 1/√3), u2 = (−2/√6, 1/√6, 1/√6), u3 = (0, −1/√2, 1/√2)}
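The computation of Example 13.12 can be packaged as a short routine; the function below is a direct transcription of the proof of Theorem 13.4, applied to the same basis:

```python
import numpy as np

def gram_schmidt(vs):
    """Orthonormalize a list of linearly independent real vectors."""
    us = []
    for v in vs:
        # subtract the components along the u's already found (Lemma 13.3)
        w = v - sum(np.dot(v, u) * u for u in us)
        us.append(w / np.linalg.norm(w))
    return us

v1 = np.array([1., 1., 1.])
v2 = np.array([0., 1., 1.])
v3 = np.array([0., 0., 1.])
u1, u2, u3 = gram_schmidt([v1, v2, v3])
```

The resulting u1, u2, u3 agree with the hand computation above, and the transition matrix from {vi} to {ui} is triangular as Theorem 13.4 asserts.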
LINEAR FUNCTIONALS AND ADJOINT OPERATORS
Let V be an inner product space. Each u ∈ V determines a mapping û : V → K defined by
û(v) = ⟨v, u⟩
Now for any a, b ∈ K and any v1, v2 ∈ V,
û(av1 + bv2) = ⟨av1 + bv2, u⟩ = a⟨v1, u⟩ + b⟨v2, u⟩ = aû(v1) + bû(v2)
That is, û is a linear functional on V. The converse is also true for spaces of finite dimension
and is an important theorem. Namely,
Theorem 13.5: Let φ be a linear functional on a finite dimensional inner product space V.
Then there exists a unique vector u ∈ V such that φ(v) = ⟨v, u⟩ for every
v ∈ V.
We remark that the above theorem is not valid for spaces of infinite dimension (Problem
13.45), although some general results in this direction are known. (One such famous result
is the Riesz representation theorem.)
We use the above theorem to prove
Theorem 13.6: Let T be a linear operator on a finite dimensional inner product space V.
Then there exists a unique linear operator T* on V such that
⟨T(u), v⟩ = ⟨u, T*(v)⟩
for every u, v ∈ V. Moreover, if A is the matrix of T relative to an
orthonormal basis {ei} of V, then the conjugate transpose A* of A is the
matrix of T* in the basis {ei}.
We emphasize that no such simple relationship exists between the matrices representing
T and T* if the basis is not orthonormal. Thus we see one useful property of orthonormal
bases.
Definition: A linear operator T on an inner product space V is said to have an adjoint
operator T* on V if ⟨T(u), v⟩ = ⟨u, T*(v)⟩ for every u, v ∈ V.
Thus Theorem 13.6 states that every operator T has an adjoint if V has finite dimension.
This theorem is not valid if V has infinite dimension (Problem 13.78).
Example 13.13: Let T be the linear operator on C³ defined by
T(x, y, z) = (2x + iy, y − 5iz, x + (1 − i)y + 3z)
We find a similar formula for the adjoint T* of T. Note (Problem 7.3) that the
matrix of T in the usual basis of C³ is
[T] = ( 2    i     0  )
      ( 0    1    −5i )
      ( 1   1−i    3  )
Recall that the usual basis is orthonormal. Thus by Theorem 13.6, the matrix of T*
in this basis is the conjugate transpose of [T]:
[T*] = (  2    0     1  )
       ( −i    1    1+i )
       (  0    5i    3  )
Accordingly,
T*(x, y, z) = (2x + z, −ix + y + (1 + i)z, 5iy + 3z)
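The defining relation ⟨T(u), v⟩ = ⟨u, T*(v)⟩ can be checked numerically for the matrix of Example 13.13; the test vectors below are random, chosen purely for illustration:

```python
import numpy as np

T = np.array([[2, 1j, 0],
              [0, 1, -5j],
              [1, 1 - 1j, 3]])       # matrix of T in the usual basis of C^3
Tstar = T.conj().T                   # conjugate transpose = matrix of T*

inner = lambda u, v: np.vdot(v, u)   # <u, v> = sum_i u_i conj(v_i)

rng = np.random.default_rng(1)
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

lhs = inner(T @ u, v)                # <T(u), v>
rhs = inner(u, Tstar @ v)            # <u, T*(v)>
```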
The following theorem summarizes some of the properties of the adjoint.
Theorem 13.7: Let S and T be linear operators on V and let k ∈ K. Then:
(i) (S + T)* = S* + T*   (iii) (ST)* = T*S*
(ii) (kT)* = k̄T*          (iv) (T*)* = T
ANALOGY BETWEEN A(V) AND C, SPECIAL OPERATORS
Let A(V) denote the algebra of all linear operators on a finite dimensional inner product
space V. The adjoint mapping T ↦ T* on A(V) is quite analogous to the conjugation mapping
z ↦ z̄ on the complex field C. To illustrate this analogy we identify in the following
table certain classes of operators T ∈ A(V) whose behavior under the adjoint map imitates
the behavior under conjugation of familiar classes of complex numbers.

Class of complex numbers | Behavior under conjugation | Class of operators in A(V) | Behavior under the adjoint map
Unit circle (|z| = 1) | z̄ = 1/z | Orthogonal operators (real case); unitary operators (complex case) | T* = T⁻¹
Real axis | z̄ = z | Self-adjoint operators; also called symmetric (real case), Hermitian (complex case) | T* = T
Imaginary axis | z̄ = −z | Skew-adjoint operators; also called skew-symmetric (real case), skew-Hermitian (complex case) | T* = −T
Positive half axis (0, ∞) | z = w̄w, w ≠ 0 | Positive definite operators | T = S*S with S nonsingular
The analogy between these classes of operators T and complex numbers z is reflected in
the following theorem.
Theorem 13.8: Let λ be an eigenvalue of a linear operator T on V.
(i) If T* = T⁻¹, then |λ| = 1.
(ii) If T* = T, then λ is real.
(iii) If T* = −T, then λ is pure imaginary.
(iv) If T = S*S with S nonsingular, then λ is real and positive.
We now prove the above theorem. In each case let v be a nonzero eigenvector of T
belonging to λ, that is, T(v) = λv with v ≠ 0; hence ⟨v, v⟩ is positive.
Proof of (i): We show that λλ̄⟨v, v⟩ = ⟨v, v⟩:
λλ̄⟨v, v⟩ = ⟨λv, λv⟩ = ⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, I(v)⟩ = ⟨v, v⟩
But ⟨v, v⟩ ≠ 0; hence λλ̄ = 1 and so |λ| = 1.
Proof of (ii): We show that λ⟨v, v⟩ = λ̄⟨v, v⟩:
λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, T(v)⟩ = ⟨v, λv⟩ = λ̄⟨v, v⟩
But ⟨v, v⟩ ≠ 0; hence λ = λ̄ and so λ is real.
Proof of (iii): We show that λ⟨v, v⟩ = −λ̄⟨v, v⟩:
λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, −T(v)⟩ = ⟨v, −λv⟩ = −λ̄⟨v, v⟩
But ⟨v, v⟩ ≠ 0; hence λ = −λ̄, and so λ is pure imaginary.
Proof of (iv): Note first that S(v) ≠ 0 because S is nonsingular; hence ⟨S(v), S(v)⟩ is
positive. We show that λ⟨v, v⟩ = ⟨S(v), S(v)⟩:
λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨S*S(v), v⟩ = ⟨S(v), S(v)⟩
But ⟨v, v⟩ and ⟨S(v), S(v)⟩ are positive; hence λ is positive.
We remark that all the above operators T commute with their adjoint, that is,
TT* = T*T. Such operators are called normal operators.
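Parts (i) and (ii) of Theorem 13.8 are easy to observe numerically; the two matrices below are illustrative choices, a Hermitian matrix and a plane rotation (which is orthogonal):

```python
import numpy as np

# (ii) self-adjoint: the eigenvalues are real
H = np.array([[1, 1j], [-1j, 1]])    # H* = H
eig_H = np.linalg.eigvals(H)

# (i) orthogonal (so U* = U^-1): the eigenvalues lie on the unit circle
t = 0.7                              # an arbitrary angle
U = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
eig_U = np.linalg.eigvals(U)
```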
ORTHOGONAL AND UNITARY OPERATORS
Let U be a linear operator on a finite dimensional inner product space V. As defined
above, if
U* = U⁻¹ or equivalently UU* = U*U = I
then U is said to be orthogonal or unitary according as the underlying field is real or
complex. The next theorem gives alternate characterizations of these operators.
Theorem 13.9: The following conditions on an operator U are equivalent:
(i) U* = U⁻¹; that is, UU* = U*U = I.
(ii) U preserves inner products, i.e. for every v, w ∈ V,
⟨U(v), U(w)⟩ = ⟨v, w⟩
(iii) U preserves lengths, i.e. for every v ∈ V, ‖U(v)‖ = ‖v‖.
Example 13.14: Let T : R³ → R³ be the linear operator which
rotates each vector about the z axis by a fixed
angle θ:
T(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z)
Observe that lengths (distances from the origin)
are preserved under T. Thus T is an
orthogonal operator.
Example 13.15: Let V be the ℓ2-space of Example 13.5. Let T : V → V be the linear operator
defined by T(a1, a2, ...) = (0, a1, a2, ...). Clearly, T preserves inner products and
lengths. However, T is not surjective since, for example, (1, 0, 0, ...) does not belong
to the image of T; hence T is not invertible. Thus we see that Theorem 13.9 is not
valid for spaces of infinite dimension.
An isomorphism from one inner product space into another is a bijective mapping
which preserves the three basic operations of an inner product space: vector addition,
scalar multiplication, and inner products. Thus the above mappings (orthogonal and
unitary) may also be characterized as the isomorphisms of V into itself. Note that such a
mapping U also preserves distances, since
‖U(v) − U(w)‖ = ‖U(v − w)‖ = ‖v − w‖
and so U is also called an isometry.
ORTHOGONAL AND UNITARY MATRICES
Let U be a linear operator on an inner product space V. By Theorem 13.6 we obtain the
following result when the base field K is complex.
Theorem 13.10A: A matrix A with complex entries represents a unitary operator U
(relative to an orthonormal basis) if and only if A* = A⁻¹.
On the other hand, if the base field K is real then A* = Aᵗ; hence we have the following
corresponding theorem for real inner product spaces.
Theorem 13.10B: A matrix A with real entries represents an orthogonal operator U
(relative to an orthonormal basis) if and only if Aᵗ = A⁻¹.
The above theorems motivate the following definitions.
Definition: A complex matrix A for which A* = A⁻¹, or equivalently AA* = A*A = I,
is called a unitary matrix.
Definition: A real matrix A for which Aᵗ = A⁻¹, or equivalently AAᵗ = AᵗA = I, is
called an orthogonal matrix.
Observe that a unitary matrix with real entries is orthogonal.
Example 13.16: Suppose A = ( a1 a2 ; b1 b2 ) is a unitary matrix. Then AA* = I and hence
AA* = ( a1 a2 ; b1 b2 )( ā1 b̄1 ; ā2 b̄2 ) = ( |a1|² + |a2|²   a1b̄1 + a2b̄2 ; b1ā1 + b2ā2   |b1|² + |b2|² ) = ( 1 0 ; 0 1 ) = I
Thus
|a1|² + |a2|² = 1,  |b1|² + |b2|² = 1  and  a1b̄1 + a2b̄2 = 0
Accordingly, the rows of A form an orthonormal set. Similarly, A*A = I forces
the columns of A to form an orthonormal set.
The result in the above example holds true in general; namely,
Theorem 13.11: The following conditions for a matrix A are equivalent:
(i) A is unitary (orthogonal).
(ii) The rows of A form an orthonormal set.
(iii) The columns of A form an orthonormal set.
Example 13.17: The matrix A representing the rotation T in Example 13.14 relative to the usual
basis of R³ is
A = ( cos θ  −sin θ   0 )
    ( sin θ   cos θ   0 )
    (   0       0     1 )
As expected, the rows and the columns of A each form an orthonormal set; that is,
A is an orthogonal matrix.
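Theorem 13.11 can be confirmed for this rotation matrix at a particular angle (θ = 0.5 below is an arbitrary illustrative value):

```python
import numpy as np

t = 0.5                              # an arbitrary angle
A = np.array([[np.cos(t), -np.sin(t), 0.],
              [np.sin(t),  np.cos(t), 0.],
              [0.,         0.,        1.]])

rows_gram = A @ A.T                  # (i, j) entry = <row i, row j>
cols_gram = A.T @ A                  # (i, j) entry = <column i, column j>
```

Both Gram matrices are the identity, so the rows and columns are orthonormal and Aᵗ is the inverse of A.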
CHANGE OF ORTHONORMAL BASIS
In view of the special role of orthonormal bases in the theory of inner product spaces,
we are naturally interested in the properties of the transition matrix from one such basis
to another. The following theorem applies.
Theorem 13.12: Let {e1, ..., en} be an orthonormal basis of an inner product space V.
Then the transition matrix from {ei} into another orthonormal basis is
unitary (orthogonal). Conversely, if P = (aij) is a unitary (orthogonal)
matrix, then the following is an orthonormal basis:
{e′i = a1ie1 + a2ie2 + ⋯ + anien : i = 1, ..., n}
Recall that matrices A and B representing the same linear operator T are similar, i.e.
B = P⁻¹AP where P is the (nonsingular) transition matrix. On the other hand, if V is
an inner product space, we are usually interested in the case when P is unitary (or orthog-
onal) as suggested by the above theorem. (Recall that P is unitary if P* = P⁻¹, and P is
orthogonal if Pᵗ = P⁻¹.) This leads to the following definition.
Definition: Complex matrices A and B are unitarily equivalent if there is a unitary matrix
P for which B = P*AP. Analogously, real matrices A and B are orthogonally
equivalent if there is an orthogonal matrix P for which B = PᵗAP.
Observe that orthogonally equivalent matrices are necessarily congruent (see page 262).
POSITIVE OPERATORS
Let P be a linear operator on an inner product space V. P is said to be positive (or:
semi-definite) if
P = S*S for some operator S
and is said to be positive definite if S is also nonsingular. The next theorems give alternate
characterizations of these operators.
Theorem 13.13A: The following conditions on an operator P are equivalent:
(i) P = T² for some self-adjoint operator T.
(ii) P = S*S for some operator S.
(iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V.
The corresponding theorem for positive definite operators is
Theorem 13.13B: The following conditions on an operator P are equivalent:
(i) P = T² for some nonsingular self-adjoint operator T.
(ii) P = S*S for some nonsingular operator S.
(iii) P is self-adjoint and ⟨P(u), u⟩ > 0 for every u ≠ 0 in V.
DIAGONALIZATION AND CANONICAL FORMS IN EUCLIDEAN SPACES
Let T be a linear operator on a finite dimensional inner product space V over K. Representing
T by a diagonal matrix depends upon the eigenvectors and eigenvalues of T,
and hence upon the roots of the characteristic polynomial Δ(t) of T (Theorem 9.6). Now
Δ(t) always factors into linear polynomials over the complex field C, but may not have any
linear polynomials over the real field R. Thus the situation for Euclidean spaces (where
K = R) is inherently different from that for unitary spaces (where K = C); hence we treat
them separately. We investigate Euclidean spaces below, and unitary spaces in the next
section.
Theorem 13.14: Let T be a symmetric (self-adjoint) operator on a real finite dimensional
inner product space V. Then there exists an orthonormal basis of V
consisting of eigenvectors of T; that is, T can be represented by a
diagonal matrix relative to an orthonormal basis.
We give the corresponding statement for matrices.
Alternate Form of Theorem 13.14: Let A be a real symmetric matrix. Then there exists
an orthogonal matrix P such that B = P⁻¹AP = PᵗAP
is diagonal.
We can choose the columns of the above matrix P to be normalized orthogonal eigenvectors
of A; then the diagonal entries of B are the corresponding eigenvalues.
Example 13.18: Let A = ( 2 −2 ; −2 5 ). We find an orthogonal matrix P such that PᵗAP is diagonal.
The characteristic polynomial Δ(t) of A is
Δ(t) = |tI − A| = | t−2   2 ; 2   t−5 | = (t − 6)(t − 1)
The eigenvalues of A are 6 and 1. Substitute t = 6 into the matrix tI − A to
obtain the corresponding homogeneous system of linear equations
4x + 2y = 0, 2x + y = 0
A nonzero solution is v1 = (1, −2). Next substitute t = 1 into the matrix tI − A
to find the corresponding homogeneous system
−x + 2y = 0, 2x − 4y = 0
A nonzero solution is v2 = (2, 1). As expected by Problem 13.31, v1 and v2 are orthogonal.
Normalize v1 and v2 to obtain the orthonormal basis
{u1 = (1/√5, −2/√5), u2 = (2/√5, 1/√5)}
Finally let P be the matrix whose columns are u1 and u2 respectively. Then
P = ( 1/√5  2/√5 ; −2/√5  1/√5 ) and P⁻¹AP = PᵗAP = ( 6 0 ; 0 1 )
As expected, the diagonal entries of PᵗAP are the eigenvalues corresponding to the
columns of P.
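The diagonalization of Example 13.18 is exactly what a symmetric eigensolver returns. A sketch (note that `np.linalg.eigh` orders the eigenvalues in ascending order, so the columns come out in the order 1, 6 rather than 6, 1):

```python
import numpy as np

A = np.array([[2., -2.],
              [-2., 5.]])
w, P = np.linalg.eigh(A)             # eigenvalues ascending, orthonormal columns
D = P.T @ A @ P                      # P^t A P, diagonal by Theorem 13.14
```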
We observe that the matrix B = P⁻¹AP = PᵗAP is also congruent to A. Now if q is
a real quadratic form represented by the matrix A, then the above method can be used
to diagonalize q under an orthogonal change of coordinates. This is illustrated in the
next example.
Example 13.19: Find an orthogonal transformation of coordinates which diagonalizes the quadratic
form q(x, y) = 2x² − 4xy + 5y².
The symmetric matrix representing q is A = ( 2 −2 ; −2 5 ). In the preceding
example we obtained the orthogonal matrix
P = ( 1/√5  2/√5 ; −2/√5  1/√5 ) for which PᵗAP = ( 6 0 ; 0 1 )
(Here 6 and 1 are the eigenvalues of A.) Thus the required orthogonal transformation
of coordinates is
( x ; y ) = P( x′ ; y′ ), that is, x = x′/√5 + 2y′/√5, y = −2x′/√5 + y′/√5
Under this change of coordinates q is transformed into the diagonal form
q(x′, y′) = 6x′² + y′²
Note that the diagonal entries of q are the eigenvalues of A.
An orthogonal operator T need not be symmetric, and so it may not be represented by
a diagonal matrix relative to an orthonormal basis. However, such an operator T does have
a simple canonical representation, as described in the next theorem.
Theorem 13.15: Let T be an orthogonal operator on a real inner product space V. Then
there is an orthonormal basis with respect to which T is represented by a
block diagonal matrix whose diagonal blocks are 1's, −1's, and 2 × 2 blocks
of the form
( cos θi  −sin θi ; sin θi  cos θi ),  i = 1, ..., r
The reader may recognize the above 2 by 2 diagonal blocks as representing rotations in
the corresponding two-dimensional subspaces.
DIAGONALIZATION AND CANONICAL FORMS IN UNITARY SPACES
We now present the fundamental diagonalization theorem for complex inner product
spaces, i.e. for unitary spaces. Recall that an operator T is said to be normal if it com
mutes with its adjoint, i.e. if TT* = T* T. Analogously, a complex matrix A is said to be
normal if it commutes with its conjugate transpose, i.e. if AA* = A*A.
Example 13.20: Let A = ( 1    1   ). Then
                       ( i   3+2i )
AA* = ( 1    1   )( 1    −i  )  =  (  2     3−3i )
      ( i   3+2i )( 1   3−2i )     ( 3+3i    14  )
A*A = ( 1   −i   )( 1    1   )  =  (  2     3−3i )
      ( 1   3−2i )( i   3+2i )     ( 3+3i    14  )
Thus A is a normal matrix.
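The normality condition of Example 13.20 is easy to verify numerically; this is a sketch assuming NumPy is available.

```python
import numpy as np

# The matrix of Example 13.20.
A = np.array([[1, 1],
              [1j, 3 + 2j]])

A_star = A.conj().T                     # conjugate transpose A*

# Both products should equal the same Hermitian matrix.
expected = np.array([[2, 3 - 3j],
                     [3 + 3j, 14]])
```

A is normal precisely when `A @ A_star` equals `A_star @ A`.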
The following theorem applies.
Theorem 13.16: Let T be a normal operator on a complex finite dimensional inner product
space V. Then there exists an orthonormal basis of V consisting of
eigenvectors of T; that is, T can be represented by a diagonal matrix
relative to an orthonormal basis.
We give the corresponding statement for matrices.
Alternate Form of Theorem 13.16: Let A be a normal matrix. Then there exists a uni-
tary matrix P such that B = P⁻¹AP = P*AP is diagonal.
The next theorem shows that even nonnormal operators on unitary spaces have a
relatively simple form.
Theorem 13.17: Let T be an arbitrary operator on a complex finite dimensional inner
product space V. Then T can be represented by a triangular matrix
relative to an orthonormal basis of V.
Alternate Form of Theorem 13.17: Let A be an arbitrary complex matrix. Then there
exists a unitary matrix P such that B = P⁻¹AP = P*AP is triangular.
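This triangular (Schur) form can be computed numerically; the sketch below assumes SciPy is available and uses its `schur` routine with complex output.

```python
import numpy as np
from scipy.linalg import schur

# An arbitrary (non-normal) complex matrix.
A = np.array([[1, 2],
              [3, 4]], dtype=complex)

# schur(..., output='complex') returns T upper triangular and Z unitary
# with A = Z T Z*, i.e. T = Z^{-1} A Z = Z* A Z.
T, Z = schur(A, output='complex')
```

The diagonal of `T` carries the eigenvalues of A, as in the theorem.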
SPECTRAL THEOREM
The Spectral Theorem is a reformulation of the diagonalization Theorems 13.14 and 13.16.
Theorem 13.18 (Spectral Theorem): Let T be a normal (symmetric) operator on a com
plex (real) finite dimensional inner product space V. Then there exist
orthogonal projections E₁, ..., Eᵣ on V and scalars λ₁, ..., λᵣ such that
(i) T = λ₁E₁ + λ₂E₂ + ··· + λᵣEᵣ
(ii) E₁ + E₂ + ··· + Eᵣ = I
(iii) EᵢEⱼ = 0 for i ≠ j.
The next example shows the relationship between a diagonal matrix representation and
the corresponding orthogonal projections.
Example 13.21: Consider the diagonal matrix A = diag(2, 3, 5). Let
E₁ = diag(1, 0, 0),   E₂ = diag(0, 1, 0),   E₃ = diag(0, 0, 1)
The reader can verify that the Eᵢ are projections, i.e. Eᵢ² = Eᵢ, and that
(i) A = 2E₁ + 3E₂ + 5E₃, (ii) E₁ + E₂ + E₃ = I, (iii) EᵢEⱼ = 0 for i ≠ j
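The three properties of the spectral decomposition in Example 13.21 can be checked mechanically; a sketch assuming NumPy is available:

```python
import numpy as np

# Diagonal matrix and the coordinate projections of Example 13.21.
A = np.diag([2.0, 3.0, 5.0])
E1 = np.diag([1.0, 0.0, 0.0])
E2 = np.diag([0.0, 1.0, 0.0])
E3 = np.diag([0.0, 0.0, 1.0])

# (i) A is the eigenvalue-weighted sum of the projections.
recomposed = 2 * E1 + 3 * E2 + 5 * E3
```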
Solved Problems
INNER PRODUCTS
13.1. Verify the relation ⟨u, av₁ + bv₂⟩ = ā⟨u, v₁⟩ + b̄⟨u, v₂⟩.
Using [I₂], [I₁] and then [I₂], we find
⟨u, av₁ + bv₂⟩ = conj(⟨av₁ + bv₂, u⟩) = conj(a⟨v₁, u⟩ + b⟨v₂, u⟩)
= ā conj(⟨v₁, u⟩) + b̄ conj(⟨v₂, u⟩) = ā⟨u, v₁⟩ + b̄⟨u, v₂⟩
13.2. Verify that the following is an inner product in R²:
⟨u, v⟩ = x₁y₁ − x₁y₂ − x₂y₁ + 3x₂y₂, where u = (x₁, x₂), v = (y₁, y₂).
Method 1.
We verify the three axioms of an inner product. Letting w = (z₁, z₂), we find
au + bw = a(x₁, x₂) + b(z₁, z₂) = (ax₁ + bz₁, ax₂ + bz₂)
Thus
⟨au + bw, v⟩ = ⟨(ax₁ + bz₁, ax₂ + bz₂), (y₁, y₂)⟩
= (ax₁ + bz₁)y₁ − (ax₁ + bz₁)y₂ − (ax₂ + bz₂)y₁ + 3(ax₂ + bz₂)y₂
= a(x₁y₁ − x₁y₂ − x₂y₁ + 3x₂y₂) + b(z₁y₁ − z₁y₂ − z₂y₁ + 3z₂y₂)
= a⟨u, v⟩ + b⟨w, v⟩
and so axiom [I₁] is satisfied. Also,
⟨v, u⟩ = y₁x₁ − y₁x₂ − y₂x₁ + 3y₂x₂ = x₁y₁ − x₁y₂ − x₂y₁ + 3x₂y₂ = ⟨u, v⟩
and axiom [I₂] is satisfied. Finally,
⟨u, u⟩ = x₁² − 2x₁x₂ + 3x₂² = (x₁ − x₂)² + 2x₂² ≥ 0
Also, ⟨u, u⟩ = 0 if and only if x₁ = 0, x₂ = 0, i.e. u = 0. Hence the last axiom [I₃] is satisfied.
Method 2.
We argue via matrices. That is, we can write ⟨u, v⟩ in matrix notation:
⟨u, v⟩ = uᵗAv = (x₁, x₂) (  1  −1 ) ( y₁ )
                         ( −1   3 ) ( y₂ )
and so [I₁] holds. Since A is symmetric, [I₂] holds. Thus we need only show that A is positive
definite. Applying the elementary row operation R₂ → R₁ + R₂ and then the corresponding ele-
mentary column operation C₂ → C₁ + C₂, we transform A into the diagonal form ( 1  0 ). Thus A
                                                                             ( 0  2 )
is positive definite and [I₃] holds.
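Method 2 can be sketched numerically (assuming NumPy): positive definiteness of the symmetric matrix A is equivalent to all of its eigenvalues being positive, and the matrix form of the inner product is easy to evaluate.

```python
import numpy as np

# Gram matrix of the inner product <u, v> = x1*y1 - x1*y2 - x2*y1 + 3*x2*y2.
A = np.array([[1.0, -1.0],
              [-1.0, 3.0]])

def ip(u, v):
    # <u, v> = u^t A v
    return u @ A @ v

# A is symmetric; it is positive definite iff its eigenvalues are all > 0.
evals = np.linalg.eigvalsh(A)
```

For instance, ip applied to v = (3, 4) with itself gives 33, matching Problem 13.3(ii).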
13.3. Find the norm of v = (3, 4) ∈ R² with respect to:
(i) the usual inner product, (ii) the inner product in Problem 13.2.
(i) ‖v‖² = ⟨v, v⟩ = ⟨(3, 4), (3, 4)⟩ = 9 + 16 = 25; hence ‖v‖ = 5.
(ii) ‖v‖² = ⟨v, v⟩ = ⟨(3, 4), (3, 4)⟩ = 9 − 12 − 12 + 48 = 33; hence ‖v‖ = √33.
13.4. Normalize each of the following vectors in Euclidean space R³:
(i) u = (2, 1, −1), (ii) v = (1/2, 2/3, −1/4).
(i) Note ⟨u, u⟩ is the sum of the squares of the entries of u; that is, ⟨u, u⟩ = 2² + 1² + (−1)² = 6.
Hence divide u by ‖u‖ = √⟨u, u⟩ = √6 to obtain the required unit vector:
u/‖u‖ = (2/√6, 1/√6, −1/√6)
(ii) First multiply v by 12 to "clear" the fractions: 12v = (6, 8, −3). We have ⟨12v, 12v⟩ =
6² + 8² + (−3)² = 109. Then the required unit vector is
12v/‖12v‖ = (6/√109, 8/√109, −3/√109)
13.5. Let V be the vector space of polynomials with inner product given by ⟨f, g⟩ =
∫₀¹ f(t)g(t) dt. Let f(t) = t + 2 and g(t) = t² − 2t − 3. Find (i) ⟨f, g⟩ and (ii) ‖f‖.
(i) ⟨f, g⟩ = ∫₀¹ (t + 2)(t² − 2t − 3) dt = [t⁴/4 − 7t²/2 − 6t]₀¹ = −37/4
(ii) ⟨f, f⟩ = ∫₀¹ (t + 2)(t + 2) dt = [(t + 2)³/3]₀¹ = 19/3 and ‖f‖ = √⟨f, f⟩ = √(19/3)
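The polynomial inner product of Problem 13.5 can be sketched with NumPy's polynomial class, which multiplies polynomials exactly and integrates via the antiderivative:

```python
from numpy.polynomial import Polynomial as P

f = P([2, 1])          # t + 2   (coefficients in increasing degree)
g = P([-3, -2, 1])     # t^2 - 2t - 3

def ip(p, q, a=0.0, b=1.0):
    # <p, q> = integral over [a, b] of p(t) q(t) dt, via the antiderivative
    F = (p * q).integ()
    return F(b) - F(a)

fg = ip(f, g)          # should be -37/4
ff = ip(f, f)          # should be 19/3
```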
13.6. Prove Theorem 13.1 (Cauchy-Schwarz): |⟨u, v⟩| ≤ ‖u‖ ‖v‖.
If v = 0, the inequality reduces to 0 ≤ 0 and hence is valid. Now suppose v ≠ 0. Using
z conj(z) = |z|² (for any complex number z) and ⟨v, u⟩ = conj(⟨u, v⟩), we expand ‖u − ⟨u, v⟩tv‖² ≥ 0 where t
is any real value:
0 ≤ ‖u − ⟨u, v⟩tv‖² = ⟨u − ⟨u, v⟩tv, u − ⟨u, v⟩tv⟩
= ⟨u, u⟩ − t conj(⟨u, v⟩)⟨u, v⟩ − t⟨u, v⟩⟨v, u⟩ + t²⟨u, v⟩conj(⟨u, v⟩)⟨v, v⟩
= ‖u‖² − 2t|⟨u, v⟩|² + t²|⟨u, v⟩|²‖v‖²
Set t = 1/‖v‖² to find 0 ≤ ‖u‖² − |⟨u, v⟩|²/‖v‖², from which |⟨u, v⟩|² ≤ ‖u‖² ‖v‖². Taking the square
root of both sides, we obtain the required inequality.
13.7. Prove that the norm in an inner product space satisfies the following axioms:
[N₁]: ‖v‖ ≥ 0; and ‖v‖ = 0 if and only if v = 0.
[N₂]: ‖kv‖ = |k| ‖v‖.
[N₃]: ‖u + v‖ ≤ ‖u‖ + ‖v‖.
By [I₃], ⟨v, v⟩ ≥ 0; hence ‖v‖ = √⟨v, v⟩ ≥ 0. Furthermore, ‖v‖ = 0 if and only if ⟨v, v⟩ = 0,
and this holds if and only if v = 0. Thus [N₁] is valid.
We find ‖kv‖² = ⟨kv, kv⟩ = k k̄ ⟨v, v⟩ = |k|² ‖v‖². Taking the square root of both sides gives [N₂].
Using the Cauchy-Schwarz inequality, we obtain
‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩
≤ ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)²
Taking the square root of both sides yields [N₃].
Remark: [N₃] is frequently called the triangle inequality because,
if we view u + v as the side of the triangle formed with u and v,
then [N₃] states that the length of one
side of a triangle is less than or equal to the sum of the lengths
of the other two sides.
ORTHOGONALITY
13.8. Show that if u is orthogonal to v, then every scalar multiple of u is also orthogonal
to v. Find a unit vector orthogonal to v₁ = (1, 1, 2) and v₂ = (0, 1, 3) in R³.
If ⟨u, v⟩ = 0 then ⟨ku, v⟩ = k⟨u, v⟩ = k · 0 = 0, as required. Let w = (x, y, z). We want
0 = ⟨w, v₁⟩ = x + y + 2z    and    0 = ⟨w, v₂⟩ = y + 3z
Thus we obtain the homogeneous system
x + y + 2z = 0,    y + 3z = 0
Set z = 1 to find y = −3 and x = 1; then w = (1, −3, 1). Normalize w to obtain the required
unit vector w' orthogonal to v₁ and v₂:  w' = w/‖w‖ = (1/√11, −3/√11, 1/√11)
13.9. Let W be the subspace of R⁵ spanned by u = (1, 2, 3, −1, 2) and v = (2, 4, 7, 2, −1).
Find a basis of the orthogonal complement W⊥ of W.
We seek all vectors w = (x, y, z, s, t) such that
⟨w, u⟩ = x + 2y + 3z − s + 2t = 0
⟨w, v⟩ = 2x + 4y + 7z + 2s − t = 0
Eliminating x from the second equation, we find the equivalent system
x + 2y + 3z − s + 2t = 0
z + 4s − 5t = 0
The free variables are y, s and t. Set y = −1, s = 0, t = 0 to obtain the solution w₁ = (2, −1, 0, 0, 0).
Set y = 0, s = 1, t = 0 to find the solution w₂ = (13, 0, −4, 1, 0). Set y = 0, s = 0, t = 1 to obtain
the solution w₃ = (−17, 0, 5, 0, 1). The set {w₁, w₂, w₃} is a basis of W⊥.
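Numerically, W⊥ is the null space of the matrix whose rows are u and v; a sketch assuming NumPy, using the standard SVD-based construction (the rows of Vᵗ beyond the rank span the null space and come out orthonormal):

```python
import numpy as np

# Rows are the spanning vectors u and v of W.
M = np.array([[1.0, 2.0, 3.0, -1.0, 2.0],
              [2.0, 4.0, 7.0, 2.0, -1.0]])

# Null space of M = orthogonal complement of W.
_, s, Vt = np.linalg.svd(M)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:]          # orthonormal basis of W-perp
```

Each row of `null_basis` is orthogonal to both u and v, and the solution w₁ = (2, −1, 0, 0, 0) found above indeed lies in that null space.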
13.10. Find an orthonormal basis of the subspace W of C³ spanned by v₁ = (1, i, 0) and
v₂ = (1, 2, 1−i).
Apply the Gram-Schmidt orthogonalization process. First normalize v₁. We find
‖v₁‖² = ⟨v₁, v₁⟩ = 1·1 + i·(−i) + 0·0 = 2    and so    ‖v₁‖ = √2
Thus u₁ = v₁/‖v₁‖ = (1/√2, i/√2, 0).
To form w₂ = v₂ − ⟨v₂, u₁⟩u₁, first compute
⟨v₂, u₁⟩ = ⟨(1, 2, 1−i), (1/√2, i/√2, 0)⟩ = 1/√2 − 2i/√2 = (1 − 2i)/√2
Then
w₂ = (1, 2, 1−i) − ((1 − 2i)/√2)(1/√2, i/√2, 0) = ((1 + 2i)/2, (2 − i)/2, 1 − i)
Next normalize w₂ or, equivalently, 2w₂ = (1 + 2i, 2 − i, 2 − 2i). We have
‖2w₂‖² = ⟨2w₂, 2w₂⟩ = (1 + 2i)(1 − 2i) + (2 − i)(2 + i) + (2 − 2i)(2 + 2i) = 18
and ‖2w₂‖ = √18. Thus the required orthonormal basis of W is
u₁ = (1/√2, i/√2, 0),    u₂ = 2w₂/‖2w₂‖ = ((1 + 2i)/√18, (2 − i)/√18, (2 − 2i)/√18)
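The Gram-Schmidt process of Problem 13.10 can be sketched in a few lines (assuming NumPy; `np.vdot(a, b)` computes Σ conj(aₖ)bₖ, so `np.vdot(u, w)` is ⟨w, u⟩ in the book's convention ⟨x, y⟩ = Σ xₖ conj(yₖ)):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize complex vectors under <x, y> = sum x_k * conj(y_k)."""
    basis = []
    for v in vectors:
        w = v.astype(complex)
        for u in basis:
            w = w - np.vdot(u, w) * u   # subtract the component <w, u> u
        basis.append(w / np.linalg.norm(w))
    return basis

u1, u2 = gram_schmidt([np.array([1, 1j, 0]),
                       np.array([1, 2, 1 - 1j])])
```

The output reproduces the basis computed by hand above.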
13.11. Prove Lemma 13.3: An orthonormal set {u₁, ..., uᵣ} is linearly independent and, for
any v ∈ V, the vector
w = v − ⟨v, u₁⟩u₁ − ⟨v, u₂⟩u₂ − ··· − ⟨v, uᵣ⟩uᵣ
is orthogonal to each of the uᵢ.
Suppose a₁u₁ + ··· + aᵣuᵣ = 0. Taking the inner product of both sides with respect to u₁,
0 = ⟨0, u₁⟩ = ⟨a₁u₁ + ··· + aᵣuᵣ, u₁⟩
= a₁⟨u₁, u₁⟩ + a₂⟨u₂, u₁⟩ + ··· + aᵣ⟨uᵣ, u₁⟩
= a₁ · 1 + a₂ · 0 + ··· + aᵣ · 0 = a₁
or a₁ = 0. Similarly, for i = 2, ..., r,
0 = ⟨0, uᵢ⟩ = ⟨a₁u₁ + ··· + aᵣuᵣ, uᵢ⟩
= a₁⟨u₁, uᵢ⟩ + ··· + aᵢ⟨uᵢ, uᵢ⟩ + ··· + aᵣ⟨uᵣ, uᵢ⟩ = aᵢ
Accordingly, {u₁, ..., uᵣ} is linearly independent.
It remains to show that w is orthogonal to each of the uᵢ. Taking the inner product of w with
respect to u₁,
⟨w, u₁⟩ = ⟨v, u₁⟩ − ⟨v, u₁⟩⟨u₁, u₁⟩ − ⟨v, u₂⟩⟨u₂, u₁⟩ − ··· − ⟨v, uᵣ⟩⟨uᵣ, u₁⟩
= ⟨v, u₁⟩ − ⟨v, u₁⟩ · 1 − ⟨v, u₂⟩ · 0 − ··· − ⟨v, uᵣ⟩ · 0 = 0
That is, w is orthogonal to u₁. Similarly, for i = 2, ..., r,
⟨w, uᵢ⟩ = ⟨v, uᵢ⟩ − ⟨v, u₁⟩⟨u₁, uᵢ⟩ − ··· − ⟨v, uᵢ⟩⟨uᵢ, uᵢ⟩ − ··· − ⟨v, uᵣ⟩⟨uᵣ, uᵢ⟩ = 0
Thus w is orthogonal to uᵢ for i = 1, ..., r, as claimed.
13.12. Let W be a subspace of an inner product space V. Show that there is an orthonormal
basis of W which is part of an orthonormal basis of V.
We choose a basis {v₁, ..., vᵣ} of W and extend it to a basis {v₁, ..., vₙ} of V. We then apply
the Gram-Schmidt orthogonalization process to {v₁, ..., vₙ} to obtain an orthonormal basis
{u₁, ..., uₙ} of V where, for i = 1, ..., n, uᵢ = a_{i1}v₁ + ··· + a_{ii}vᵢ. Thus u₁, ..., uᵣ ∈ W and there-
fore {u₁, ..., uᵣ} is an orthonormal basis of W.
13.13. Prove Theorem 13.2: Let W be a subspace of V; then V = W ⊕ W⊥.
By Problem 13.12 there exists an orthonormal basis {u₁, ..., uᵣ} of W which is part of an ortho-
normal basis {u₁, ..., uₙ} of V. Since {u₁, ..., uₙ} is orthonormal, u_{r+1}, ..., uₙ ∈ W⊥. If v ∈ V,
v = a₁u₁ + ··· + aₙuₙ    where    a₁u₁ + ··· + aᵣuᵣ ∈ W,   a_{r+1}u_{r+1} + ··· + aₙuₙ ∈ W⊥
Accordingly, V = W + W⊥.
On the other hand, if w ∈ W ∩ W⊥, then ⟨w, w⟩ = 0. This yields w = 0; hence W ∩ W⊥ = {0}.
The two conditions, V = W + W⊥ and W ∩ W⊥ = {0}, give the desired result V = W ⊕ W⊥.
Note that we have proved the theorem only for the case that V has finite dimension; we remark
that the theorem also holds for spaces of arbitrary dimension.
13.14. Let W be a subspace of V. Show that W ⊆ W⊥⊥, and that W = W⊥⊥ when V
has finite dimension.
Let w ∈ W. Then ⟨w, v⟩ = 0 for every v ∈ W⊥; hence w ∈ W⊥⊥. Accordingly, W ⊆ W⊥⊥.
Now suppose V has finite dimension. By Theorem 13.2, V = W ⊕ W⊥ and, also, V =
W⊥ ⊕ W⊥⊥. Hence
dim W = dim V − dim W⊥    and    dim W⊥⊥ = dim V − dim W⊥
This yields dim W = dim W⊥⊥. But W ⊆ W⊥⊥ by the above; hence W = W⊥⊥, as required.
13.15. Let {e₁, ..., eₙ} be an orthonormal basis of V. Prove:
(i) for any u ∈ V, u = ⟨u, e₁⟩e₁ + ⟨u, e₂⟩e₂ + ··· + ⟨u, eₙ⟩eₙ;
(ii) ⟨a₁e₁ + ··· + aₙeₙ, b₁e₁ + ··· + bₙeₙ⟩ = a₁b̄₁ + a₂b̄₂ + ··· + aₙb̄ₙ;
(iii) for any u, v ∈ V, ⟨u, v⟩ = ⟨u, e₁⟩conj(⟨v, e₁⟩) + ··· + ⟨u, eₙ⟩conj(⟨v, eₙ⟩);
(iv) if T : V → V is linear, then ⟨T(eⱼ), eᵢ⟩ is the ij-entry of the matrix A representing
T in the given basis {eᵢ}.
(i) Suppose u = k₁e₁ + k₂e₂ + ··· + kₙeₙ. Taking the inner product of u with e₁,
⟨u, e₁⟩ = ⟨k₁e₁ + k₂e₂ + ··· + kₙeₙ, e₁⟩
= k₁⟨e₁, e₁⟩ + k₂⟨e₂, e₁⟩ + ··· + kₙ⟨eₙ, e₁⟩
= k₁ · 1 + k₂ · 0 + ··· + kₙ · 0 = k₁
Similarly, for i = 2, ..., n,
⟨u, eᵢ⟩ = ⟨k₁e₁ + ··· + kᵢeᵢ + ··· + kₙeₙ, eᵢ⟩
= k₁⟨e₁, eᵢ⟩ + ··· + kᵢ⟨eᵢ, eᵢ⟩ + ··· + kₙ⟨eₙ, eᵢ⟩
= k₁ · 0 + ··· + kᵢ · 1 + ··· + kₙ · 0 = kᵢ
Substituting ⟨u, eᵢ⟩ for kᵢ in the equation u = k₁e₁ + ··· + kₙeₙ, we obtain the desired result.
(ii) We have  ⟨ Σᵢ aᵢeᵢ, Σⱼ bⱼeⱼ ⟩ = Σ_{i,j} aᵢb̄ⱼ⟨eᵢ, eⱼ⟩
But ⟨eᵢ, eⱼ⟩ = 0 for i ≠ j, and ⟨eᵢ, eⱼ⟩ = 1 for i = j; hence, as required,
⟨ Σᵢ aᵢeᵢ, Σⱼ bⱼeⱼ ⟩ = Σᵢ aᵢb̄ᵢ = a₁b̄₁ + a₂b̄₂ + ··· + aₙb̄ₙ
(iii) By (i), u = ⟨u, e₁⟩e₁ + ··· + ⟨u, eₙ⟩eₙ and v = ⟨v, e₁⟩e₁ + ··· + ⟨v, eₙ⟩eₙ
Then by (ii), ⟨u, v⟩ = ⟨u, e₁⟩conj(⟨v, e₁⟩) + ⟨u, e₂⟩conj(⟨v, e₂⟩) + ··· + ⟨u, eₙ⟩conj(⟨v, eₙ⟩)
(iv) By (i),
T(e₁) = ⟨T(e₁), e₁⟩e₁ + ⟨T(e₁), e₂⟩e₂ + ··· + ⟨T(e₁), eₙ⟩eₙ
.......................................................
T(eₙ) = ⟨T(eₙ), e₁⟩e₁ + ⟨T(eₙ), e₂⟩e₂ + ··· + ⟨T(eₙ), eₙ⟩eₙ
The matrix A representing T in the basis {eᵢ} is the transpose of the above matrix of co-
efficients; hence the ij-entry of A is ⟨T(eⱼ), eᵢ⟩.
ADJOINTS
13.16. Let T be the linear operator on C³ defined by
T(x, y, z) = (2x + (1−i)y, (3+2i)x − 4iz, 2ix + (4−3i)y − 3z)
Find T*(x, y, z).
First find the matrix A representing T in the usual basis of C³ (see Problem 7.3):
    (  2    1−i    0  )
A = ( 3+2i   0    −4i )
    (  2i   4−3i  −3  )
Form the conjugate transpose A* of A:
     (  2    3−2i  −2i  )
A* = ( 1+i    0    4+3i )
     (  0     4i   −3   )
Thus
T*(x, y, z) = (2x + (3−2i)y − 2iz, (1+i)x + (4+3i)z, 4iy − 3z)
13.17. Prove Theorem 13.5: Let φ be a linear functional on a finite dimensional inner
product space V. Then there exists a unique u ∈ V such that φ(v) = ⟨v, u⟩ for
every v ∈ V.
Let {e₁, ..., eₙ} be an orthonormal basis of V. Set
u = conj(φ(e₁))e₁ + conj(φ(e₂))e₂ + ··· + conj(φ(eₙ))eₙ
Let û be the linear functional on V defined by û(v) = ⟨v, u⟩, for every v ∈ V. Then for i = 1, ..., n,
û(eᵢ) = ⟨eᵢ, u⟩ = ⟨eᵢ, conj(φ(e₁))e₁ + ··· + conj(φ(eₙ))eₙ⟩ = φ(eᵢ)
Since û and φ agree on each basis vector, û = φ.
Now suppose u' is another vector in V for which φ(v) = ⟨v, u'⟩ for every v ∈ V. Then
⟨v, u⟩ = ⟨v, u'⟩ or ⟨v, u − u'⟩ = 0. In particular this is true for v = u − u' and so ⟨u − u', u − u'⟩ = 0.
This yields u − u' = 0 and u = u'. Thus such a vector u is unique as claimed.
13.18. Prove Theorem 13.6: Let T be a linear operator on a finite dimensional inner product
space V. Then there exists a unique linear operator T* on V such that ⟨T(u), v⟩ =
⟨u, T*(v)⟩ for every u, v ∈ V. Moreover, if A is the matrix representing T in an
orthonormal basis {eᵢ} of V, then the conjugate transpose A* of A is the matrix rep-
resenting T* in {eᵢ}.
We first define the mapping T*. Let v be an arbitrary but fixed element of V. The map
u ↦ ⟨T(u), v⟩ is a linear functional on V. Hence by Theorem 13.5 there exists a unique element
v' ∈ V such that ⟨T(u), v⟩ = ⟨u, v'⟩ for every u ∈ V. We define T* : V → V by T*(v) = v'. Then
⟨T(u), v⟩ = ⟨u, T*(v)⟩ for every u, v ∈ V.
We next show that T* is linear. For any u, vᵢ ∈ V, and any a, b ∈ K,
⟨u, T*(av₁ + bv₂)⟩ = ⟨T(u), av₁ + bv₂⟩ = ā⟨T(u), v₁⟩ + b̄⟨T(u), v₂⟩
= ā⟨u, T*(v₁)⟩ + b̄⟨u, T*(v₂)⟩ = ⟨u, aT*(v₁) + bT*(v₂)⟩
But this is true for every u ∈ V; hence T*(av₁ + bv₂) = aT*(v₁) + bT*(v₂). Thus T* is linear.
By Problem 13.15(iv), the matrices A = (aᵢⱼ) and B = (bᵢⱼ) representing T and T* respectively
in the basis {eᵢ} are given by aᵢⱼ = ⟨T(eⱼ), eᵢ⟩ and bᵢⱼ = ⟨T*(eⱼ), eᵢ⟩. Hence
bᵢⱼ = ⟨T*(eⱼ), eᵢ⟩ = conj(⟨eᵢ, T*(eⱼ)⟩) = conj(⟨T(eᵢ), eⱼ⟩) = āⱼᵢ
Thus B = A*, as claimed.
13.19. Prove Theorem 13.7: Let S and T be linear operators on a finite dimensional inner
product space V and let k ∈ K. Then:
(i) (S + T)* = S* + T*    (iii) (ST)* = T*S*
(ii) (kT)* = k̄T*          (iv) (T*)* = T
(i) For any u, v ∈ V,
⟨(S + T)(u), v⟩ = ⟨S(u) + T(u), v⟩ = ⟨S(u), v⟩ + ⟨T(u), v⟩ = ⟨u, S*(v)⟩ + ⟨u, T*(v)⟩
= ⟨u, S*(v) + T*(v)⟩ = ⟨u, (S* + T*)(v)⟩
The uniqueness of the adjoint implies (S + T)* = S* + T*.
(ii) For any u, v ∈ V,
⟨(kT)(u), v⟩ = ⟨kT(u), v⟩ = k⟨T(u), v⟩ = k⟨u, T*(v)⟩ = ⟨u, k̄T*(v)⟩ = ⟨u, (k̄T*)(v)⟩
The uniqueness of the adjoint implies (kT)* = k̄T*.
(iii) For any u, v ∈ V,
⟨(ST)(u), v⟩ = ⟨S(T(u)), v⟩ = ⟨T(u), S*(v)⟩ = ⟨u, T*(S*(v))⟩ = ⟨u, (T*S*)(v)⟩
The uniqueness of the adjoint implies (ST)* = T*S*.
(iv) For any u, v ∈ V,
⟨T*(u), v⟩ = conj(⟨v, T*(u)⟩) = conj(⟨T(v), u⟩) = ⟨u, T(v)⟩
The uniqueness of the adjoint implies (T*)* = T.
13.20. Show that: (i) I* = I; (ii) 0* = 0; (iii) if T is invertible, then (T⁻¹)* = (T*)⁻¹.
(i) For every u, v ∈ V, ⟨I(u), v⟩ = ⟨u, v⟩ = ⟨u, I(v)⟩; hence I* = I.
(ii) For every u, v ∈ V, ⟨0(u), v⟩ = ⟨0, v⟩ = 0 = ⟨u, 0⟩ = ⟨u, 0(v)⟩; hence 0* = 0.
(iii) I = I* = (TT⁻¹)* = (T⁻¹)*T*; hence (T⁻¹)* = (T*)⁻¹.
13.21. Let T be a linear operator on V, and let W be a T-invariant subspace of V. Show
that W⊥ is invariant under T*.
Let u ∈ W⊥. If w ∈ W, then T(w) ∈ W and so ⟨w, T*(u)⟩ = ⟨T(w), u⟩ = 0. Thus T*(u) ∈ W⊥
since it is orthogonal to every w ∈ W. Hence W⊥ is invariant under T*.
13.22. Let T be a linear operator on V. Show that each of the following conditions implies
T = 0:
(i) ⟨T(u), v⟩ = 0 for every u, v ∈ V;
(ii) V is a complex space, and ⟨T(u), u⟩ = 0 for every u ∈ V;
(iii) T is self-adjoint and ⟨T(u), u⟩ = 0 for every u ∈ V.
Give an example of an operator T on a real space V for which ⟨T(u), u⟩ = 0 for
every u ∈ V but T ≠ 0.
(i) Set v = T(u). Then ⟨T(u), T(u)⟩ = 0 and hence T(u) = 0, for every u ∈ V. Accordingly,
T = 0.
(ii) By hypothesis, ⟨T(v + w), v + w⟩ = 0 for any v, w ∈ V. Expanding and setting ⟨T(v), v⟩ = 0
and ⟨T(w), w⟩ = 0,
⟨T(v), w⟩ + ⟨T(w), v⟩ = 0    (1)
Note w is arbitrary in (1). Substituting iw for w, and using ⟨T(v), iw⟩ = ī⟨T(v), w⟩ =
−i⟨T(v), w⟩ and ⟨T(iw), v⟩ = ⟨iT(w), v⟩ = i⟨T(w), v⟩,
−i⟨T(v), w⟩ + i⟨T(w), v⟩ = 0
Dividing through by i and adding to (1), we obtain ⟨T(w), v⟩ = 0 for any v, w ∈ V. By (i),
T = 0.
(iii) By (ii), the result holds for the complex case; hence we need only consider the real case.
Expanding ⟨T(v + w), v + w⟩ = 0, we again obtain (1). Since T is self-adjoint and since it is
a real space, we have ⟨T(w), v⟩ = ⟨w, T(v)⟩ = ⟨T(v), w⟩. Substituting this into (1), we obtain
⟨T(v), w⟩ = 0 for any v, w ∈ V. By (i), T = 0.
For our example, consider the linear operator T on R² defined by T(x, y) = (y, −x). Then
⟨T(u), u⟩ = 0 for every u ∈ V, but T ≠ 0.
ORTHOGONAL AND UNITARY OPERATORS AND MATRICES
13.23. Prove Theorem 13.9: The following conditions on an operator U are equivalent:
(i) U* = U⁻¹; (ii) ⟨U(v), U(w)⟩ = ⟨v, w⟩ for every v, w ∈ V; (iii) ‖U(v)‖ = ‖v‖ for
every v ∈ V.
Suppose (i) holds. Then, for every v, w ∈ V,
⟨U(v), U(w)⟩ = ⟨v, U*U(w)⟩ = ⟨v, I(w)⟩ = ⟨v, w⟩
Thus (i) implies (ii). Now if (ii) holds, then
‖U(v)‖ = √⟨U(v), U(v)⟩ = √⟨v, v⟩ = ‖v‖
Hence (ii) implies (iii). It remains to show that (iii) implies (i).
Suppose (iii) holds. Then for every v ∈ V,
⟨U*U(v), v⟩ = ⟨U(v), U(v)⟩ = ⟨v, v⟩ = ⟨I(v), v⟩
Hence ⟨(U*U − I)(v), v⟩ = 0 for every v ∈ V. But U*U − I is self-adjoint (Prove!); then by Prob-
lem 13.22 we have U*U − I = 0 and so U*U = I. Thus U* = U⁻¹ as claimed.
13.24. Let U be a unitary (orthogonal) operator on V, and let W be a subspace invariant
under U. Show that W⊥ is also invariant under U.
Since U is nonsingular, U(W) = W; that is, for any w ∈ W there exists w' ∈ W such that
U(w') = w. Now let v ∈ W⊥. Then for any w ∈ W,
⟨U(v), w⟩ = ⟨U(v), U(w')⟩ = ⟨v, w'⟩ = 0
Thus U(v) belongs to W⊥. Therefore W⊥ is invariant under U.
13.25. Let A be a matrix with rows Rᵢ and columns Cᵢ. Show that: (i) the ij-entry of AA*
is ⟨Rᵢ, Rⱼ⟩; (ii) the ij-entry of A*A is ⟨Cⱼ, Cᵢ⟩.
If A = (aᵢⱼ), then A* = (bᵢⱼ) where bᵢⱼ = āⱼᵢ. Thus AA* = (cᵢⱼ) where
cᵢⱼ = Σₖ aᵢₖbₖⱼ = Σₖ aᵢₖāⱼₖ = a_{i1}ā_{j1} + a_{i2}ā_{j2} + ··· + a_{in}ā_{jn}
= ⟨(a_{i1}, ..., a_{in}), (a_{j1}, ..., a_{jn})⟩ = ⟨Rᵢ, Rⱼ⟩
as required. Also, A*A = (dᵢⱼ) where
dᵢⱼ = Σₖ bᵢₖaₖⱼ = Σₖ āₖᵢaₖⱼ = a_{1j}ā_{1i} + a_{2j}ā_{2i} + ··· + a_{nj}ā_{ni}
= ⟨(a_{1j}, ..., a_{nj}), (a_{1i}, ..., a_{ni})⟩ = ⟨Cⱼ, Cᵢ⟩
13.26. Prove Theorem 13.11: The following conditions for a matrix A are equivalent:
(i) A is unitary (orthogonal). (ii) The rows of A form an orthonormal set. (iii) The
columns of A form an orthonormal set.
Let Rᵢ and Cᵢ denote the rows and columns of A, respectively. By the preceding problem,
AA* = (cᵢⱼ) where cᵢⱼ = ⟨Rᵢ, Rⱼ⟩. Thus AA* = I if and only if ⟨Rᵢ, Rⱼ⟩ = δᵢⱼ. That is, (i) is
equivalent to (ii).
Also, by the preceding problem, A*A = (dᵢⱼ) where dᵢⱼ = ⟨Cⱼ, Cᵢ⟩. Thus A*A = I if and only
if ⟨Cᵢ, Cⱼ⟩ = δᵢⱼ. That is, (i) is equivalent to (iii).
Remark: Since (ii) and (iii) are equivalent, A is unitary (orthogonal) if and only if the transpose
of A is unitary (orthogonal).
13.27. Find an orthogonal matrix A whose first row is u₁ = (1/3, 2/3, 2/3).
First find a nonzero vector w₂ = (x, y, z) which is orthogonal to u₁, i.e. for which
0 = ⟨u₁, w₂⟩ = x/3 + 2y/3 + 2z/3    or    x + 2y + 2z = 0
One such solution is w₂ = (0, 1, −1). Normalize w₂ to obtain the second row of A, i.e.
u₂ = (0, 1/√2, −1/√2).
Next find a nonzero vector w₃ = (x, y, z) which is orthogonal to both u₁ and u₂, i.e. for which
0 = ⟨u₁, w₃⟩ = x/3 + 2y/3 + 2z/3    or    x + 2y + 2z = 0
0 = ⟨u₂, w₃⟩ = y/√2 − z/√2          or    y − z = 0
Set z = −1 and find the solution w₃ = (4, −1, −1). Normalize w₃ and obtain the third row of A,
i.e. u₃ = (4/√18, −1/√18, −1/√18). Thus
    (  1/3      2/3      2/3   )
A = (   0      1/√2    −1/√2   )
    ( 4/3√2   −1/3√2   −1/3√2  )
We emphasize that the above matrix A is not unique.
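One mechanical way to complete a given unit first row to an orthogonal matrix is to adjoin any vectors completing it to a basis and orthonormalize with a QR factorization; this sketch (assuming NumPy) produces some valid A, generally different from the one found by hand above.

```python
import numpy as np

u1 = np.array([1/3, 2/3, 2/3])          # given unit first row

# Columns: u1 followed by standard basis vectors that keep the set independent.
M = np.column_stack([u1, [0, 1, 0], [0, 0, 1]])

# QR orthonormalizes the columns; the first column of Q is +/- u1.
Q, _ = np.linalg.qr(M)
A = Q.T                                  # rows of A form an orthonormal set
if np.dot(A[0], u1) < 0:
    A = -A                               # fix the sign so the first row is u1
```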
13.28. Prove Theorem 13.12: Let {e₁, ..., eₙ} be an orthonormal basis of an inner product
space V. Then the transition matrix from {eᵢ} into another orthonormal basis is
unitary (orthogonal). Conversely, if P = (aᵢⱼ) is a unitary (orthogonal) matrix, then
the following is an orthonormal basis:
{e'ᵢ = a_{1i}e₁ + a_{2i}e₂ + ··· + a_{ni}eₙ : i = 1, ..., n}
Suppose {fᵢ} is another orthonormal basis and suppose
fᵢ = b_{i1}e₁ + b_{i2}e₂ + ··· + b_{in}eₙ,    i = 1, ..., n    (1)
By Problem 13.15 and since {fᵢ} is orthonormal,
δᵢⱼ = ⟨fᵢ, fⱼ⟩ = b_{i1}b̄_{j1} + b_{i2}b̄_{j2} + ··· + b_{in}b̄_{jn}    (2)
Let B = (bᵢⱼ) be the matrix of coefficients in (1). (Then Bᵗ is the transition matrix from {eᵢ} to
{fᵢ}.) By Problem 13.25, BB* = (cᵢⱼ) where cᵢⱼ = b_{i1}b̄_{j1} + ··· + b_{in}b̄_{jn}. By (2), cᵢⱼ = δᵢⱼ
and therefore BB* = I. Accordingly B, and hence Bᵗ, is unitary.
It remains to prove that {e'ᵢ} is orthonormal. By Problem 13.15,
⟨e'ᵢ, e'ⱼ⟩ = a_{1i}ā_{1j} + a_{2i}ā_{2j} + ··· + a_{ni}ā_{nj} = ⟨Cᵢ, Cⱼ⟩
where Cᵢ denotes the ith column of the unitary (orthogonal) matrix P = (aᵢⱼ). By Theorem 13.11,
the columns of P are orthonormal; hence ⟨e'ᵢ, e'ⱼ⟩ = ⟨Cᵢ, Cⱼ⟩ = δᵢⱼ. Thus {e'ᵢ} is an orthonormal basis.
13.29. Suppose A is orthogonal. Show that det(A) = 1 or −1.
Since A is orthogonal, AAᵗ = I. Using |A| = |Aᵗ|,
1 = |I| = |AAᵗ| = |A| |Aᵗ| = |A|²
Therefore |A| = 1 or −1.
13.30. Show that every 2 by 2 orthogonal matrix A for which det(A) = 1 is of the form
( cos θ  −sin θ )    for some real number θ.
( sin θ   cos θ )
Suppose A = ( a  b ). Since A is orthogonal, its rows form an orthonormal set; hence
            ( c  d )
a² + b² = 1,    c² + d² = 1,    ac + bd = 0,    ad − bc = 1
The last equation follows from det(A) = 1. We consider separately the cases a = 0 and a ≠ 0.
If a = 0, the first equation gives b² = 1 and therefore b = ±1. Then the fourth equation
gives c = −b = ∓1, and the second equation yields 1 + d² = 1 or d = 0. Thus
A = (  0  1 )    or    A = ( 0  −1 )
    ( −1  0 )              ( 1   0 )
The first alternative has the required form with θ = −π/2, and the second alternative has the required
form with θ = π/2.
If a ≠ 0, the third equation can be solved to give c = −bd/a. Substituting this into the
second equation,
b²d²/a² + d² = 1    or    b²d² + a²d² = a²    or    (b² + a²)d² = a²    or    a² = d²
and therefore a = d or a = −d. If a = −d, then the third equation yields c = b and so the
fourth equation gives −a² − c² = 1 which is impossible. Thus a = d. But then the third equa-
tion gives b = −c and so
A = ( a  −c )
    ( c   a )
Since a² + c² = 1, there is a real number θ such that a = cos θ, c = sin θ, and hence A has the
required form in this case also.
SYMMETRIC OPERATORS AND CANONICAL FORMS IN EUCLIDEAN SPACES
13.31. Let T be a symmetric operator. Show that: (i) the characteristic polynomial Δ(t) of
T is a product of linear polynomials (over R); (ii) T has a nonzero eigenvector;
(iii) eigenvectors of T belonging to distinct eigenvalues are orthogonal.
(i) Let A be a matrix representing T relative to an orthonormal basis of V; then A = Aᵗ. Let
Δ(t) be the characteristic polynomial of A. Viewing A as a complex self-adjoint operator, A
has only real eigenvalues by Theorem 13.8. Thus
Δ(t) = (t − λ₁)(t − λ₂) ··· (t − λₙ)
where the λᵢ are all real. In other words, Δ(t) is a product of linear polynomials over R.
(ii) By (i), T has at least one (real) eigenvalue. Hence T has a nonzero eigenvector.
(iii) Suppose T(v) = λv and T(w) = μw where λ ≠ μ. We show that λ⟨v, w⟩ = μ⟨v, w⟩:
λ⟨v, w⟩ = ⟨λv, w⟩ = ⟨T(v), w⟩ = ⟨v, T(w)⟩ = ⟨v, μw⟩ = μ⟨v, w⟩
But λ ≠ μ; hence ⟨v, w⟩ = 0 as claimed.
13.32. Prove Theorem 13.14: Let T be a symmetric operator on a real inner product space
V. Then there exists an orthonormal basis of V consisting of eigenvectors of T;
that is, T can be represented by a diagonal matrix relative to an orthonormal basis.
The proof is by induction on the dimension of V. If dim V = 1, the theorem trivially holds.
Now suppose dim V = n > 1. By the preceding problem, there exists a nonzero eigenvector v₁ of
T. Let W be the space spanned by v₁, and let u₁ be a unit vector in W, e.g. let u₁ = v₁/‖v₁‖.
Since v₁ is an eigenvector of T, the subspace W of V is invariant under T. By Problem 13.21,
W⊥ is invariant under T* = T. Thus the restriction T̂ of T to W⊥ is a symmetric operator. By
Theorem 13.2, V = W ⊕ W⊥. Hence dim W⊥ = n − 1 since dim W = 1. By induction, there
exists an orthonormal basis {u₂, ..., uₙ} of W⊥ consisting of eigenvectors of T̂ and hence of T. But
⟨uᵢ, u₁⟩ = 0 for i = 2, ..., n because uᵢ ∈ W⊥. Accordingly {u₁, u₂, ..., uₙ} is an orthonormal set
and consists of eigenvectors of T. Thus the theorem is proved.
13.33. Let A = ( 1  2 ). Find a (real) orthogonal matrix P for which PᵗAP is diagonal.
               ( 2  1 )
The characteristic polynomial Δ(t) of A is
Δ(t) = |tI − A| = | t−1   −2  | = t² − 2t − 3 = (t − 3)(t + 1)
                  |  −2   t−1 |
and thus the eigenvalues of A are 3 and −1. Substitute t = 3 into the matrix tI − A to obtain the
corresponding homogeneous system of linear equations
2x − 2y = 0,    −2x + 2y = 0
A nonzero solution is v₁ = (1, 1). Normalize v₁ to find the unit solution u₁ = (1/√2, 1/√2).
Next substitute t = −1 into the matrix tI − A to obtain the corresponding homogeneous system
of linear equations
−2x − 2y = 0,    −2x − 2y = 0
A nonzero solution is v₂ = (1, −1). Normalize v₂ to find the unit solution u₂ = (1/√2, −1/√2).
Finally let P be the matrix whose columns are u₁ and u₂ respectively; then
P = ( 1/√2    1/√2 )    and    PᵗAP = ( 3   0 )
    ( 1/√2   −1/√2 )                  ( 0  −1 )
As expected, the diagonal entries of PᵗAP are the eigenvalues of A.
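As a numerical cross-check of Problem 13.33 (a sketch assuming NumPy): `numpy.linalg.eigh` handles symmetric matrices and returns an orthonormal eigenvector matrix directly, with eigenvalues in ascending order (−1, 3 here).

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# eigh: ascending eigenvalues, orthonormal eigenvectors as columns of P.
eigvals, P = np.linalg.eigh(A)
D = P.T @ A @ P
```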
13.34. Let A = ( 2  1  1 ). Find a (real) orthogonal matrix P for which PᵗAP is diagonal.
               ( 1  2  1 )
               ( 1  1  2 )
First find the characteristic polynomial Δ(t) of A:
Δ(t) = |tI − A| = | t−2   −1    −1  | = (t − 1)²(t − 4)
                  |  −1   t−2   −1  |
                  |  −1   −1   t−2  |
Thus the eigenvalues of A are 1 (with multiplicity two) and 4 (with multiplicity one). Substitute
t = 1 into the matrix tI − A to obtain the corresponding homogeneous system
−x − y − z = 0,    −x − y − z = 0,    −x − y − z = 0
That is, x + y + z = 0. The system has two independent solutions. One such solution is v₁ =
(1, −1, 0). We seek a second solution v₂ = (a, b, c) which is also orthogonal to v₁; that is, such that
a + b + c = 0    and also    a − b = 0
For example, v₂ = (1, 1, −2). Next we normalize v₁ and v₂ to obtain the unit orthogonal solutions
u₁ = (1/√2, −1/√2, 0),    u₂ = (1/√6, 1/√6, −2/√6)
Now substitute t = 4 into the matrix tI − A to find the corresponding homogeneous system
2x − y − z = 0,    −x + 2y − z = 0,    −x − y + 2z = 0
Find a nonzero solution such as v₃ = (1, 1, 1), and normalize v₃ to obtain the unit solution
u₃ = (1/√3, 1/√3, 1/√3). Finally, if P is the matrix whose columns are the uᵢ respectively,
    (  1/√2   1/√6   1/√3 )                       ( 1  0  0 )
P = ( −1/√2   1/√6   1/√3 )    and    PᵗAP =     ( 0  1  0 )
    (   0    −2/√6   1/√3 )                       ( 0  0  4 )
13.35. Find an orthogonal change of coordinates which diagonalizes the real quadratic form
q(x, y) = 2x² + 2xy + 2y².
First find the symmetric matrix A representing q and then its characteristic polynomial Δ(t):
A = ( 2  1 )    and    Δ(t) = |tI − A| = | t−2   −1  | = (t − 1)(t − 3)
    ( 1  2 )                             |  −1   t−2 |
The eigenvalues of A are 1 and 3; hence the diagonal form of q is
q(x', y') = x'² + 3y'²
We find the corresponding transformation of coordinates by obtaining a corresponding orthonormal
set of eigenvectors of A.
Set t = 1 into the matrix tI − A to obtain the corresponding homogeneous system
−x − y = 0,    −x − y = 0
A nonzero solution is v₁ = (1, −1). Now set t = 3 into the matrix tI − A to find the corresponding
homogeneous system
x − y = 0,    −x + y = 0
A nonzero solution is v₂ = (1, 1). As expected by Problem 13.31, v₁ and v₂ are orthogonal. Normalize
v₁ and v₂ to obtain the orthonormal basis
{u₁ = (1/√2, −1/√2), u₂ = (1/√2, 1/√2)}
The transition matrix P and the required transformation of coordinates follow:
P = (  1/√2   1/√2 )    and    ( x ) = P ( x' ),   that is,   x = (x' + y')/√2
    ( −1/√2   1/√2 )           ( y )     ( y' )               y = (−x' + y')/√2
Note that the columns of P are u₁ and u₂. We can also express x' and y' in terms of x and y by
using P⁻¹ = Pᵗ; that is,
x' = (x − y)/√2,    y' = (x + y)/√2
13.36. Prove Theorem 13.15: Let T be an orthogonal operator on a real inner product space
V. Then there is an orthonormal basis with respect to which T has the following
form:
a block diagonal matrix of the form
diag( 1, ..., 1, −1, ..., −1, ( cos θ₁  −sin θ₁ ), ..., ( cos θᵣ  −sin θᵣ ) )
                              ( sin θ₁   cos θ₁ )       ( sin θᵣ   cos θᵣ )
Let S = T + T⁻¹ = T + T*. Then S* = (T + T*)* = T* + T = S. Thus S is a symmetric
operator on V. By Theorem 13.14, there exists an orthonormal basis of V consisting of eigenvectors
of S. If λ₁, ..., λₘ denote the distinct eigenvalues of S, then V can be decomposed into the direct
sum V = V₁ ⊕ V₂ ⊕ ··· ⊕ Vₘ where Vᵢ consists of the eigenvectors of S belonging to λᵢ. We
claim that each Vᵢ is invariant under T. For suppose v ∈ Vᵢ; then S(v) = λᵢv and
S(T(v)) = (T + T⁻¹)T(v) = T(T + T⁻¹)(v) = TS(v) = T(λᵢv) = λᵢT(v)
That is, T(v) ∈ Vᵢ. Hence Vᵢ is invariant under T. Since the Vᵢ are orthogonal to each other, we
can restrict our investigation to the way that T acts on each individual Vᵢ.
On a given Vᵢ, (T + T⁻¹)v = S(v) = λᵢv. Multiplying by T,
(T² − λᵢT + I)(v) = 0
We consider the cases λᵢ = ±2 and λᵢ ≠ ±2 separately. If λᵢ = ±2, then (T ∓ I)²(v) = 0 which
leads to (T ∓ I)(v) = 0 or T(v) = ±v. Thus T restricted to this Vᵢ is either I or −I.
If λᵢ ≠ ±2, then T has no eigenvectors in Vᵢ since by Theorem 13.8 the only eigenvalues of T
are 1 or −1. Accordingly, for v ≠ 0 the vectors v and T(v) are linearly independent. Let W be
the subspace spanned by v and T(v). Then W is invariant under T, since
T(T(v)) = T²(v) = λᵢT(v) − v
By Theorem 13.2, Vᵢ = W ⊕ W⊥. Furthermore, by Problem 13.24 W⊥ is also invariant under T.
Thus we can decompose Vᵢ into the direct sum of two-dimensional subspaces Wⱼ where the Wⱼ are
orthogonal to each other and each Wⱼ is invariant under T. Thus we can now restrict our investiga-
tion to the way T acts on each individual Wⱼ.
Since T² − λᵢT + I = 0, the characteristic polynomial Δ(t) of T acting on Wⱼ is Δ(t) =
t² − λᵢt + 1. Thus the determinant of T is 1, the constant term in Δ(t). By Problem 13.30, the
matrix representing T acting on Wⱼ relative to any orthonormal basis of Wⱼ must be of the form
( cos θ  −sin θ )
( sin θ   cos θ )
The union of the bases of the Wⱼ gives an orthonormal basis of Vᵢ, and the union of the bases of the
Vᵢ gives an orthonormal basis of V in which the matrix representing T is of the desired form.
NORMAL OPERATORS AND CANONICAL FORMS IN UNITARY SPACES
13.37. Determine which of the following matrices is normal: (i) A = ( 1  i ), (ii) B = ( 1   i   ).
                                                                    ( 0  1 )          ( 1  2+i )
(i) AA* = ( 1  i )( 1   0 ) = (  2   i )        A*A = (  1  0 )( 1  i ) = (  1   i )
          ( 0  1 )( −i  1 )   ( −i   1 )              ( −i  1 )( 0  1 )   ( −i   2 )
Since AA* ≠ A*A, the matrix A is not normal.
(ii) BB* = ( 1   i   )(  1    1  ) = (   2    2+2i )
           ( 1  2+i  )( −i   2−i )   ( 2−2i    6   )
     B*B = (  1    1  )( 1   i   ) = (   2    2+2i )
           ( −i   2−i )( 1  2+i  )   ( 2−2i    6   )
Since BB* = B*B, the matrix B is normal.
13.38. Let T be a normal operator. Prove:
(i) T(v) = 0 if and only if T*(v) = 0.
(ii) T − λI is normal.
(iii) If T(v) = λv, then T*(v) = λ̄v; hence any eigenvector of T is also an eigen-
vector of T*.
(iv) If T(v) = λ₁v and T(w) = λ₂w where λ₁ ≠ λ₂, then ⟨v, w⟩ = 0; that is, eigen-
vectors of T belonging to distinct eigenvalues are orthogonal.
(i) We show that ⟨T(v), T(v)⟩ = ⟨T*(v), T*(v)⟩:
⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, TT*(v)⟩ = ⟨T*(v), T*(v)⟩
Hence by [I₃], T(v) = 0 if and only if T*(v) = 0.
(ii) We show that T − λI commutes with its adjoint:
(T − λI)(T − λI)* = (T − λI)(T* − λ̄I) = TT* − λ̄T − λT* + λλ̄I
= T*T − λ̄T − λT* + λ̄λI = (T* − λ̄I)(T − λI)
= (T − λI)*(T − λI)
Thus T − λI is normal.
(iii) If T(v) = λv, then (T − λI)(v) = 0. Now T − λI is normal by (ii); therefore, by (i),
(T − λI)*(v) = 0. That is, (T* − λ̄I)(v) = 0; hence T*(v) = λ̄v.
(iv) We show that λ₁⟨v, w⟩ = λ₂⟨v, w⟩:
λ₁⟨v, w⟩ = ⟨λ₁v, w⟩ = ⟨T(v), w⟩ = ⟨v, T*(w)⟩ = ⟨v, λ̄₂w⟩ = λ₂⟨v, w⟩
But λ₁ ≠ λ₂; hence ⟨v, w⟩ = 0.
13.39. Prove Theorem 13.16: Let T be a normal operator on a complex finite dimensional
inner product space V. Then there exists an orthonormal basis of V consisting of
eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an
orthonormal basis.
The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially
holds. Now suppose dim V = n > 1. Since V is a complex vector space, T has at least one eigenvalue
and hence a nonzero eigenvector v. Let W be the subspace of V spanned by v and let u₁ be a
unit vector in W.
Since v is an eigenvector of T, the subspace W is invariant under T. However, v is also an
eigenvector of T* by the preceding problem; hence W is also invariant under T*. By Problem 13.21,
W⊥ is invariant under T** = T. The remainder of the proof is identical with the latter part of
the proof of Theorem 13.14 (Problem 13.32).
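Theorem 13.16 can be watched numerically: for a normal matrix, normalized eigenvectors assemble into a unitary P with P*BP diagonal. A sketch assuming NumPy is available (the matrix is our own example; np.linalg.eig already returns unit-norm columns, so the normalization is only a safeguard):

```python
import numpy as np

B = np.array([[1, 1j], [1, 2 + 1j]])     # a normal matrix: B B* = B* B
assert np.allclose(B @ B.conj().T, B.conj().T @ B)

vals, vecs = np.linalg.eig(B)            # columns of vecs are eigenvectors
P = vecs / np.linalg.norm(vecs, axis=0)  # normalize each column

assert np.allclose(P.conj().T @ P, np.eye(2), atol=1e-8)          # P is unitary
assert np.allclose(P.conj().T @ B @ P, np.diag(vals), atol=1e-8)  # P*BP diagonal
```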
13.40. Prove Theorem 13.17: Let T be an arbitrary operator on a complex finite dimensional
inner product space V. Then T can be represented by a triangular matrix relative
to an orthonormal basis {u₁, u₂, ..., uₙ}; that is, for i = 1, ..., n,
T(uᵢ) = aᵢ₁u₁ + aᵢ₂u₂ + ⋯ + aᵢᵢuᵢ
The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially
holds. Now suppose dim V = n > 1. Since V is a complex vector space, T has at least one eigenvalue
and hence at least one nonzero eigenvector v. Let W be the subspace of V spanned by v and
let u₁ be a unit vector in W. Then u₁ is an eigenvector of T and, say, T(u₁) = a₁₁u₁.
By Theorem 13.2, V = W ⊕ W⊥. Let E denote the orthogonal projection of V onto W⊥.
Clearly W⊥ is invariant under the operator ET. By induction, there exists an orthonormal basis
{u₂, ..., uₙ} of W⊥ such that, for i = 2, ..., n,
ET(uᵢ) = aᵢ₂u₂ + aᵢ₃u₃ + ⋯ + aᵢᵢuᵢ
(Note that {u₁, u₂, ..., uₙ} is an orthonormal basis of V.) But E is the orthogonal projection of V
onto W⊥; hence we must have
T(uᵢ) = aᵢ₁u₁ + aᵢ₂u₂ + ⋯ + aᵢᵢuᵢ
for i = 2, ..., n. This with T(u₁) = a₁₁u₁ gives us the desired result.
MISCELLANEOUS PROBLEMS
13.41. Prove Theorem 13.13A: The following conditions on an operator P are equivalent:
(i) P = T² for some self-adjoint operator T.
(ii) P = S*S for some operator S.
(iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V.
Suppose (i) holds, that is, P = T² where T = T*. Then P = TT = T*T, and so (i) implies
(ii). Now suppose (ii) holds. Then P* = (S*S)* = S*S** = S*S = P and so P is self-adjoint.
Furthermore,
⟨P(u), u⟩ = ⟨S*S(u), u⟩ = ⟨S(u), S(u)⟩ ≥ 0
Thus (ii) implies (iii), and so it remains to prove that (iii) implies (i).
Now suppose (iii) holds. Since P is self-adjoint, there exists an orthonormal basis {u₁, ..., uₙ}
of V consisting of eigenvectors of P; say, P(uᵢ) = λᵢuᵢ. By Theorem 13.8, the λᵢ are real. Using
(iii), we show that the λᵢ are nonnegative. We have, for each i,
0 ≤ ⟨P(uᵢ), uᵢ⟩ = ⟨λᵢuᵢ, uᵢ⟩ = λᵢ⟨uᵢ, uᵢ⟩
Since ⟨uᵢ, uᵢ⟩ > 0, this forces λᵢ ≥ 0, as claimed. Accordingly, √λᵢ is a real number. Let T be the
linear operator defined by
T(uᵢ) = √λᵢ uᵢ, for i = 1, ..., n
Since T is represented by a real diagonal matrix relative to the orthonormal basis {uᵢ}, T is self-adjoint. Moreover, for each i,
T²(uᵢ) = T(√λᵢ uᵢ) = √λᵢ T(uᵢ) = √λᵢ √λᵢ uᵢ = λᵢuᵢ = P(uᵢ)
Since T² and P agree on a basis of V, P = T². Thus the theorem is proved.
Remark: The above operator T is the unique positive operator such that P = T² (Problem
13.93); it is called the positive square root of P.
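The construction in the proof translates directly into code: diagonalize P by an orthonormal basis of eigenvectors, replace each λᵢ by √λᵢ, and reassemble. A sketch assuming NumPy (the test matrix is our own choice):

```python
import numpy as np

def positive_sqrt(P):
    """The positive square root T of a positive matrix P:
    T(u_i) = sqrt(lambda_i) u_i on an orthonormal eigenbasis of P."""
    vals, U = np.linalg.eigh(P)        # real eigenvalues, orthonormal columns
    assert np.all(vals >= -1e-12)      # P must be positive
    return U @ np.diag(np.sqrt(np.clip(vals, 0, None))) @ U.conj().T

P = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3, both positive
T = positive_sqrt(P)
assert np.allclose(T @ T, P)             # T^2 = P
assert np.allclose(T, T.conj().T)        # T is self-adjoint
```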
13.42. Show that any operator T is the sum of a self-adjoint operator and a skew-adjoint
operator.
Set S = ½(T + T*) and U = ½(T − T*). Then T = S + U, where
S* = (½(T + T*))* = ½(T* + T**) = ½(T* + T) = S
and U* = (½(T − T*))* = ½(T* − T) = −½(T − T*) = −U
i.e. S is self-adjoint and U is skew-adjoint.
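For matrices the decomposition is one line each way, with T* the conjugate transpose. A quick check assuming NumPy (the test matrix is arbitrary, our own choice):

```python
import numpy as np

T = np.array([[1 + 2j, 3], [1j, 4 - 1j]])   # an arbitrary operator

S = (T + T.conj().T) / 2        # self-adjoint part:  S* = S
U = (T - T.conj().T) / 2        # skew-adjoint part:  U* = -U

assert np.allclose(S, S.conj().T)
assert np.allclose(U, -U.conj().T)
assert np.allclose(S + U, T)
```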
13.43. Prove: Let T be an arbitrary linear operator on a finite dimensional inner product
space V. Then T is a product of a unitary (orthogonal) operator U and a unique
positive operator P, that is, T = UP. Furthermore, if T is invertible, then U is also
uniquely determined.
By Theorem 13.13, T*T is a positive operator and hence there exists a (unique) positive operator
P such that P² = T*T (Problem 13.93). Observe that
‖P(v)‖² = ⟨P(v), P(v)⟩ = ⟨P²(v), v⟩ = ⟨T*T(v), v⟩ = ⟨T(v), T(v)⟩ = ‖T(v)‖²    (1)
We now consider separately the cases when T is invertible and noninvertible.
If T is invertible, then we set U′ = PT⁻¹. We show that U′ is unitary:
(U′)* = (PT⁻¹)* = (T⁻¹)*P* = (T*)⁻¹P and (U′)*U′ = (T*)⁻¹PPT⁻¹ = (T*)⁻¹T*TT⁻¹ = I
Thus U′ is unitary. We next set U = (U′)⁻¹. Then U is also unitary and T = UP as required.
To prove uniqueness, we assume T = U₀P₀ where U₀ is unitary and P₀ is positive. Then
T*T = P₀*U₀*U₀P₀ = P₀IP₀ = P₀²
But the positive square root of T*T is unique (Problem 13.93); hence P₀ = P. (Note that the
invertibility of T is not used to prove the uniqueness of P.) Now if T is invertible, then P is also
invertible by (1). Multiplying U₀P = UP on the right by P⁻¹ yields U₀ = U. Thus U is also unique when
T is invertible.
Now suppose T is not invertible. Let W be the image of P, i.e. W = Im P. We define
U₁ : W → V by
U₁(w) = T(v) where P(v) = w    (2)
We must show that U₁ is well defined, that is, that P(v) = P(v′) implies T(v) = T(v′). This follows
from the fact that P(v − v′) = 0 is equivalent to ‖P(v − v′)‖ = 0, which forces ‖T(v − v′)‖ = 0
by (1). Thus U₁ is well defined. We next define U₂ : W⊥ → V. Note by (1) that P and T have the
same kernels. Hence the images of P and T have the same dimension, i.e. dim (Im P) = dim W =
dim (Im T). Consequently, W⊥ and (Im T)⊥ also have the same dimension. We let U₂ be any
isomorphism between W⊥ and (Im T)⊥.
We next set U = U₁ ⊕ U₂. (Here U is defined as follows: if v ∈ V and v = w + w′ where
w ∈ W, w′ ∈ W⊥, then U(v) = U₁(w) + U₂(w′).) Now U is linear (Problem 13.121) and, if v ∈ V
and P(v) = w, then by (2)
T(v) = U₁(w) = U(w) = UP(v)
Thus T = UP as required.
It remains to show that U is unitary. Now every vector x ∈ V can be written in the form
x = P(v) + w′ where w′ ∈ W⊥. Then U(x) = UP(v) + U₂(w′) = T(v) + U₂(w′), where ⟨T(v),
U₂(w′)⟩ = 0 by definition of U₂. Also, ⟨T(v), T(v)⟩ = ⟨P(v), P(v)⟩ by (1). Thus
⟨U(x), U(x)⟩ = ⟨T(v) + U₂(w′), T(v) + U₂(w′)⟩
= ⟨T(v), T(v)⟩ + ⟨U₂(w′), U₂(w′)⟩
= ⟨P(v), P(v)⟩ + ⟨w′, w′⟩ = ⟨P(v) + w′, P(v) + w′⟩
= ⟨x, x⟩
(We also used the fact that ⟨P(v), w′⟩ = 0.) Thus U is unitary and the theorem is proved.
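For an invertible matrix, the proof's recipe is short: P is the positive square root of T*T and U = TP⁻¹. A sketch assuming NumPy, reusing the square-root construction of Problem 13.41 (the test matrix is our own choice):

```python
import numpy as np

def polar(T):
    """Polar decomposition T = U P of an invertible matrix:
    P = sqrt(T* T) is positive definite and U = T P^{-1} is unitary."""
    vals, V = np.linalg.eigh(T.conj().T @ T)       # T*T is positive
    P = V @ np.diag(np.sqrt(vals)) @ V.conj().T
    U = T @ np.linalg.inv(P)
    return U, P

T = np.array([[2.0, 1.0], [0.0, 1.0]])             # invertible
U, P = polar(T)
assert np.allclose(U @ P, T)                       # T = U P
assert np.allclose(U.conj().T @ U, np.eye(2))      # U is unitary
assert np.all(np.linalg.eigvalsh(P) > 0)           # P is positive definite
```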
13.44. Let (a₁, a₂, ...) and (b₁, b₂, ...) be any pair of points in the ℓ²-space of Example 13.5.
Show that the sum Σᵢ₌₁^∞ aᵢbᵢ = a₁b₁ + a₂b₂ + ⋯ converges absolutely.
By Problem 1.16 (Cauchy-Schwarz inequality),
|a₁b₁| + ⋯ + |aₙbₙ| ≤ √(Σᵢ₌₁ⁿ aᵢ²) √(Σᵢ₌₁ⁿ bᵢ²) ≤ √(Σᵢ₌₁^∞ aᵢ²) √(Σᵢ₌₁^∞ bᵢ²)
which holds for every n. Thus the (monotonic) sequence of sums Sₙ = |a₁b₁| + ⋯ + |aₙbₙ| is
bounded, and therefore converges. Hence the infinite sum converges absolutely.
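The inequality can be observed numerically with a concrete pair of square-summable sequences (our choice, aᵢ = 1/i and bᵢ = 1/i²): every partial sum of |aᵢbᵢ| stays below the Cauchy-Schwarz bound.

```python
import math

# square-summable sequences: a_i = 1/i and b_i = 1/i^2 (our choice)
N = 10000
a = [1 / i for i in range(1, N + 1)]
b = [1 / i**2 for i in range(1, N + 1)]

bound = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))

partial = 0.0
for ai, bi in zip(a, b):
    partial += abs(ai * bi)
    assert partial <= bound   # every partial sum S_n is bounded, so S_n converges
```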
13.45. Let V be the vector space of polynomials over R with inner product defined by
⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Give an example of a linear functional φ on V for which
Theorem 13.5 does not hold, i.e. for which there does not exist a polynomial h(t) such that
φ(f) = ⟨f, h⟩ for every f ∈ V.
Let φ : V → R be defined by φ(f) = f(0), that is, φ evaluates f(t) at 0 and hence maps f(t) into
its constant term. Suppose a polynomial h(t) exists for which
φ(f) = f(0) = ∫₀¹ f(t)h(t) dt    (1)
for every polynomial f(t). Observe that φ maps the polynomial tf(t) into 0; hence by (1),
∫₀¹ tf(t)h(t) dt = 0    (2)
for every polynomial f(t). In particular, (2) must hold for f(t) = th(t), that is,
∫₀¹ t²h²(t) dt = 0
This integral forces h(t) to be the zero polynomial; hence φ(f) = ⟨f, h⟩ = ⟨f, 0⟩ = 0 for every polynomial f(t). This contradicts the fact that φ is not the zero functional; hence the polynomial h(t)
does not exist.
Supplementary Problems
INNER PRODUCTS
13.46. Verify that
⟨a₁u₁ + a₂u₂, b₁v₁ + b₂v₂⟩ = a₁b̄₁⟨u₁, v₁⟩ + a₁b̄₂⟨u₁, v₂⟩ + a₂b̄₁⟨u₂, v₁⟩ + a₂b̄₂⟨u₂, v₂⟩
More generally, prove that
⟨Σᵢ₌₁ᵐ aᵢuᵢ, Σⱼ₌₁ⁿ bⱼvⱼ⟩ = Σᵢ,ⱼ aᵢb̄ⱼ⟨uᵢ, vⱼ⟩
13.47. Let u = (x₁, x₂) and v = (y₁, y₂) belong to R².
(i) Verify that the following is an inner product on R²:
f(u, v) = x₁y₁ − 2x₁y₂ − 2x₂y₁ + 5x₂y₂
(ii) For what values of k is the following an inner product on R²?
f(u, v) = x₁y₁ − 3x₁y₂ − 3x₂y₁ + kx₂y₂
(iii) For what values of a, b, c, d ∈ R is the following an inner product on R²?
f(u, v) = ax₁y₁ + bx₁y₂ + cx₂y₁ + dx₂y₂
13.48. Find the norm of v = (1, 2) ∈ R² with respect to (i) the usual inner product, (ii) the inner
product in Problem 13.47(i).
13.49. Let u = (z₁, z₂) and v = (w₁, w₂) belong to C².
(i) Verify that the following is an inner product on C²:
f(u, v) = z₁w̄₁ + (1 + i)z₁w̄₂ + (1 − i)z₂w̄₁ + 3z₂w̄₂
(ii) For what values of a, b, c, d ∈ C is the following an inner product on C²?
f(u, v) = az₁w̄₁ + bz₁w̄₂ + cz₂w̄₁ + dz₂w̄₂
13.50. Find the norm of v = (1 − 2i, 2 + 3i) ∈ C² with respect to (i) the usual inner product, (ii) the
inner product in Problem 13.49(i).
13.51. Show that the distance function d(u, v) = ‖v − u‖, where u, v ∈ V, satisfies the following axioms
of a metric space:
[D₁] d(u, v) ≥ 0; and d(u, v) = 0 if and only if u = v.
[D₂] d(u, v) = d(v, u).
[D₃] d(u, v) ≤ d(u, w) + d(w, v).
13.52. Verify the Parallelogram Law: ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².
13.53. Verify the following polar forms for ⟨u, v⟩:
(i) ⟨u, v⟩ = ¼‖u + v‖² − ¼‖u − v‖² (real case);
(ii) ⟨u, v⟩ = ¼‖u + v‖² − ¼‖u − v‖² + (i/4)‖u + iv‖² − (i/4)‖u − iv‖² (complex case).
13.54. Let V be the vector space of m × n matrices over R. Show that ⟨A, B⟩ = tr(BᵗA) defines an inner
product in V.
13.55. Let V be the vector space of polynomials over R. Show that ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt defines an
inner product in V.
13.56. Find the norm of each of the following vectors:
(i) u = (1/2, 1/3, 1/4, 1/6) ∈ R⁴,
(ii) v = (1 − 2i, 3 + i, 2 − 5i) ∈ C³,
(iii) f(t) = t² − 2t + 3 in the space of Problem 13.55,
(iv) A = ( 1 2 ; 3 4 ) in the space of Problem 13.54.
13.57. Show that: (i) the sum of two inner products is an inner product; (ii) a positive multiple of an
inner product is an inner product.
13.58. Let a, b, c ∈ R be such that at² + bt + c ≥ 0 for every t ∈ R. Show that b² − 4ac ≤ 0. Use this
result to prove the Cauchy-Schwarz inequality for real inner product spaces by expanding
‖tu + v‖² ≥ 0.
13.59. Suppose |⟨u, v⟩| = ‖u‖ ‖v‖. (That is, the Cauchy-Schwarz inequality reduces to an equality.) Show
that u and v are linearly dependent.
13.60. Find the cosine of the angle θ between u and v if:
(i) u = (1, −3, 2), v = (2, 1, 5) in R³;
(ii) u = 2t − 1, v = t² in the space of Problem 13.55;
/2 1\ /O 1\
(ill) M=(_l,v = ( ) in the space of Problem 13.54.
ORTHOGONALITY
13.61. Find a basis of the subspace W of R⁴ orthogonal to u₁ = (1, −2, 3, 4) and u₂ = (3, −5, 7, 8).
13.62. Find an orthonormal basis for the subspace W of C³ spanned by w₁ = (1, i, 1) and w₂ = (1 + i, 0, 2).
13.63. Let V be the vector space of polynomials over R of degree ≤ 2 with inner product ⟨f, g⟩ =
∫₀¹ f(t)g(t) dt.
(i) Find a basis of the subspace W orthogonal to h(t) = 2t + 1.
(ii) Apply the Gram-Schmidt orthogonalization process to the basis {1, t, t²} to obtain an orthonormal basis {u₁(t), u₂(t), u₃(t)} of V.
13.64. Let V be the vector space of 2 × 2 matrices over R with inner product defined by ⟨A, B⟩ = tr(BᵗA).
(i) Show that the following is an orthonormal basis of V:
( 1 0 ; 0 0 ), ( 0 1 ; 0 0 ), ( 0 0 ; 1 0 ), ( 0 0 ; 0 1 )
(ii) Find a basis for the orthogonal complement of (a) the diagonal matrices, (b) the symmetric
matrices.
13.65. Let W be a subset (not necessarily a subspace) of V. Prove: (i) W⊥ = L(W)⊥; (ii) if V has finite
dimension, then W⊥⊥ = L(W). (Here L(W) is the space spanned by W.)
13.66. Let W be the subspace spanned by a nonzero vector w in V, and let E be the orthogonal projection
of V onto W. Prove that
E(v) = (⟨v, w⟩/‖w‖²) w
We call E(v) the projection of v along w.
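In coordinates this formula is a one-liner; a sketch for the standard inner product on Rⁿ (the helper name is ours):

```python
def proj(v, w):
    """E(v) = (<v, w> / ||w||^2) w, the projection of v along w."""
    coef = sum(vi * wi for vi, wi in zip(v, w)) / sum(wi * wi for wi in w)
    return [coef * wi for wi in w]

# e.g. v = (1, -1, 2), w = (0, 1, 1): <v, w> = 1 and ||w||^2 = 2
print(proj([1, -1, 2], [0, 1, 1]))   # [0.0, 0.5, 0.5]
```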
13.67. Find the projection of v along w if:
(i) v = (1, −1, 2), w = (0, 1, 1) in R³;
(ii) v = (1 − i, 2 + 3i), w = (2 − i, 3) in C²;
(iii) v = 2t − 1, w = t² in the space of Problem 13.55;
/I 2\ /O 1\
(iv) v = I q)''*'~(i 9)^" *^® space of Problem 13.54.
13.68. Suppose {u₁, ..., uᵣ} is a basis of a subspace W of V where dim V = n. Let {v₁, ..., vₙ₋ᵣ} be an
independent set of n − r vectors such that ⟨uᵢ, vⱼ⟩ = 0 for each i and each j. Show that
{v₁, ..., vₙ₋ᵣ} is a basis of the orthogonal complement W⊥.
13.69. Suppose {u₁, ..., uᵣ} is an orthonormal basis for a subspace W of V. Let E : V → V be the linear
mapping defined by
E(v) = ⟨v, u₁⟩u₁ + ⟨v, u₂⟩u₂ + ⋯ + ⟨v, uᵣ⟩uᵣ
Show that E is the orthogonal projection of V onto W.
13.70. Let {u₁, ..., uᵣ} be an orthonormal subset of V. Show that, for any v ∈ V, Σᵢ₌₁ʳ |⟨v, uᵢ⟩|² ≤ ‖v‖².
(This is known as Bessel's inequality.)
13.71. Let V be a real inner product space. Show that:
(i) ‖u‖ = ‖v‖ if and only if ⟨u + v, u − v⟩ = 0;
(ii) ‖u + v‖² = ‖u‖² + ‖v‖² if and only if ⟨u, v⟩ = 0.
Show by counterexamples that the above statements are not true for, say, C².
13.72. Let U and W be subspaces of a finite dimensional inner product space V. Show that: (i) (U + W)⊥ =
U⊥ ∩ W⊥; (ii) (U ∩ W)⊥ = U⊥ + W⊥.
ADJOINT OPERATOR
13.73. Let T : R³ → R³ be defined by T(x, y, z) = (x + 2y, 3x − 4z, y). Find T*(x, y, z).
13.74. Let T : C³ → C³ be defined by
T(x, y, z) = (ix + (2 + 3i)y, 3x + (3 − i)z, (2 − 5i)y + iz)
Find T*(x, y, z).
13.75. For each of the following linear functionals φ on V, find a vector u ∈ V such that φ(v) = ⟨v, u⟩ for
every v ∈ V:
(i) φ : R³ → R defined by φ(x, y, z) = x + 2y − 3z.
(ii) φ : C³ → C defined by φ(x, y, z) = ix + (2 + 3i)y + (1 − 2i)z.
(iii) φ : V → R defined by φ(f) = f(1), where V is the vector space of Problem 13.63.
13.76. Suppose V has finite dimension. Prove that the image of T* is the orthogonal complement of the
kernel of T, i.e. Im T* = (Ker T)⊥. Hence rank(T) = rank(T*).
13.77. Show that T*T = 0 implies T = 0.
13.78. Let V be the vector space of polynomials over R with inner product defined by ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt.
Let D be the derivative operator on V, i.e. D(f) = df/dt. Show that there is no operator D* on V
such that ⟨D(f), g⟩ = ⟨f, D*(g)⟩ for every f, g ∈ V. That is, D has no adjoint.
UNITARY AND ORTHOGONAL OPERATORS AND MATRICES
13.79. Find an orthogonal matrix whose first row is: (i) (1/√5, 2/√5); (ii) a multiple of (1, 1, 1).
13.80. Find a symmetric orthogonal matrix whose first row is (1/3, 2/3, 2/3). (Compare with Problem
13.27.)
13.81. Find a unitary matrix whose first row is: (i) a multiple of (1, 1 − i); (ii) (½, ½i, ½ − ½i).
13.82. Prove: The product and inverses of orthogonal matrices are orthogonal. (Thus the orthogonal
matrices form a group under multiplication called the orthogonal group.)
13.83. Prove: The product and inverses of unitary matrices are unitary. (Thus the unitary matrices
form a group under multiplication called the unitary group.)
13.84. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal.
13.85. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix
P such that B = P*AP. Show that this relation is an equivalence relation.
13.86. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal
matrix P such that B = PᵗAP. Show that this relation is an equivalence relation.
13.87. Let W be a subspace of V. For any v ∈ V, let v = w + w′ where w ∈ W, w′ ∈ W⊥. (Such a sum
is unique because V = W ⊕ W⊥.) Let T : V → V be defined by T(v) = w − w′. Show that T is
a self-adjoint unitary operator on V.
13.88. Let V be an inner product space, and suppose U : V → V (not necessarily linear) is surjective (onto)
and preserves inner products, i.e. ⟨U(v), U(w)⟩ = ⟨v, w⟩ for every v, w ∈ V. Prove that U is
linear and hence unitary.
POSITIVE AND POSITIVE DEFINITE OPERATORS
13.89. Show that the sum of two positive (positive definite) operators is positive (positive definite).
13.90. Let T be a linear operator on V and let f : V × V → K be defined by f(u, v) = ⟨T(u), v⟩. Show
that f is itself an inner product on V if and only if T is positive definite.
13.91. Suppose E is an orthogonal projection onto some subspace W of V. Prove that kI + E is positive
(positive definite) if k ≥ 0 (k > 0).
13.92. Prove Theorem 13.13B, page 288, on positive definite operators. (The corresponding Theorem
13.13A for positive operators is proved in Problem 13.41.)
13.93. Consider the operator T defined by T(uᵢ) = √λᵢ uᵢ, i = 1, ..., n, in the proof of Theorem 13.13A
(Problem 13.41). Show that T is positive and that it is the only positive operator for which T² = P.
13.94. Suppose P is both positive and unitary. Prove that P = /.
13.95. An n × n (real or complex) matrix A = (aᵢⱼ) is said to be positive if A viewed as a linear operator
on Kⁿ is positive. (An analogous definition defines a positive definite matrix.) Prove that A is positive
(positive definite) if and only if aᵢⱼ = āⱼᵢ and
Σᵢ,ⱼ₌₁ⁿ aᵢⱼ xⱼ x̄ᵢ ≥ 0 (respectively, > 0)
for every (x₁, ..., xₙ) in Kⁿ (every nonzero (x₁, ..., xₙ) in the positive definite case).
13.96. Determine which of the following matrices are positive (positive definite):
(i) (ii) (iii) (iv) (v) (vi)
13.97. Prove that a 2 × 2 complex matrix A = ( a b ; c d ) is positive if and only if (i) A = A*, and
(ii) a, d and ad − bc are nonnegative real numbers.
13.98. Prove that a diagonal matrix A is positive (positive definite) if and only if every diagonal entry is
a nonnegative (positive) real number.
SELF-ADJOINT AND SYMMETRIC OPERATORS
13.99. For any operator T, show that T + T* is self-adjoint and T − T* is skew-adjoint.
13.100. Suppose T is self-adjoint. Show that T²(v) = 0 implies T(v) = 0. Use this to prove that
Tⁿ(v) = 0 also implies T(v) = 0 for n > 0.
13.101. Let V be a complex inner product space. Suppose ⟨T(v), v⟩ is real for every v ∈ V. Show that T
is self-adjoint.
13.102. Suppose S and T are self-adjoint. Show that ST is self-adjoint if and only if S and T commute,
i.e. ST = TS.
13.103. For each of the following symmetric matrices A, find an orthogonal matrix P for which PᵗAP is
diagonal:
13.104. Find an orthogonal transformation of coordinates which diagonalizes each quadratic form:
(i) q(x, y) = 2x² − 6xy + 10y², (ii) q(x, y) = x² + 8xy − 5y²
13.105. Find an orthogonal transformation of coordinates which diagonalizes the quadratic form
q(x, y, z) = 2xy + 2xz + 2yz.
NORMAL OPERATORS AND MATRICES
13.106. Verify that A = ( 2 i ; i 2 ) is normal. Find a unitary matrix P such that P*AP is diagonal, and
find P*AP.
13.107. Show that a triangular matrix is normal if and only if it is diagonal.
13.108. Prove that if T is normal on V, then ‖T(v)‖ = ‖T*(v)‖ for every v ∈ V. Prove that the converse
holds in complex inner product spaces.
13.109. Show that self-adjoint, skew-adjoint and unitary (orthogonal) operators are normal.
13.110. Suppose T is normal. Prove that:
(i) T is self-adjoint if and only if its eigenvalues are real.
(ii) T is unitary if and only if its eigenvalues have absolute value 1.
(iii) T is positive if and only if its eigenvalues are nonnegative real numbers.
13.111. Show that if T is normal, then T and T* have the same kernel and the same image.
13.112. Suppose S and T are normal and commute. Show that S+T and ST are also normal.
13.113. Suppose T is normal and commutes with S. Show that T also commutes with S*.
13.114. Prove: Let S and T be normal operators on a complex finite dimensional vector space V. Then
there exists an orthonormal basis of V consisting of eigenvectors of both S and T. (That is, S and
T can be simultaneously diagonalized.)
ISOMORPHISM PROBLEMS
13.115. Let {e₁, ..., eₙ} be an orthonormal basis of an inner product space V over K. Show that the map
v ↦ [v]ₑ is an (inner product space) isomorphism between V and Kⁿ. (Here [v]ₑ denotes the coordinate vector of v in the basis {eᵢ}.)
13.116. Show that inner product spaces V and W over K are isomorphic if and only if V and W have the
same dimension.
13.117. Suppose {e₁, ..., eₙ} and {e′₁, ..., e′ₙ} are orthonormal bases of V and W respectively. Let T : V → W
be the linear map defined by T(eᵢ) = e′ᵢ for each i. Show that T is an isomorphism.
13.118. Let V be an inner product space. Recall (page 283) that each u ∈ V determines a linear functional
û in the dual space V* by the definition û(v) = ⟨v, u⟩ for every v ∈ V. Show that the map
u ↦ û is linear and nonsingular, and hence an isomorphism from V onto V*.
13.119. Consider the inner product space V of Problem 13.54. Show that V is isomorphic to Rᵐⁿ under the
mapping
A = (aᵢⱼ) ↦ (R₁, R₂, ..., Rₘ)
where Rᵢ = (aᵢ₁, aᵢ₂, ..., aᵢₙ), the ith row of A.
MISCELLANEOUS PROBLEMS
13.120. Show that there exists an orthonormal basis {u₁, ..., uₙ} of V consisting of eigenvectors of T if and
only if there exist orthogonal projections E₁, ..., Eᵣ and scalars λ₁, ..., λᵣ such that: (i) T =
λ₁E₁ + ⋯ + λᵣEᵣ; (ii) E₁ + ⋯ + Eᵣ = I; (iii) EᵢEⱼ = 0 for i ≠ j.
13.121. Suppose V = U ⊕ W and suppose T₁ : U → V and T₂ : W → V are linear. Show that T =
T₁ ⊕ T₂ is also linear. (Here T is defined as follows: if v ∈ V and v = u + w where u ∈ U, w ∈ W,
then T(v) = T₁(u) + T₂(w).)
13.122. Suppose U is an orthogonal operator on R³ with positive determinant. Show that U is either a
rotation or a reflection through a plane.
Answers to Supplementary Problems
13.47. (ii) k > 9; (iii) b = c, a > 0, d > 0, ad − bc > 0
13.48. (i) √5, (ii) √13
13.50. (i) 3√2, (ii) 5√2
13.56. (i) ‖u‖ = √65/12, (ii) ‖v‖ = 2√11, (iii) ‖f(t)‖ = √(83/15), (iv) ‖A‖ = √30
13.60. (i) cos θ = 9/√420, (ii) cos θ = √15/6, (iii) cos θ = 2/√210
13.61. {v₁ = (1, 2, 1, 0), v₂ = (4, 4, 0, 1)}
13.62. {v₁ = (1, i, 1)/√3, v₂ = (2i, 1 − 3i, 3 − i)/√24}
13.63. (i) {f₁(t) = 7t² − 5t, f₂(t) = 12t² − 5}
(ii) {u₁(t) = 1, u₂(t) = √3(2t − 1), u₃(t) = √5(6t² − 6t + 1)}
13.67. (i) (0, 1/2, 1/2), (ii) (26 + 7i, 27 + 24i)/14, (iii) (5/6)t², (iv) …
13.73. T*(x, y, z) = (x + 3y, 2x + z, −4y)
13.74. T*(x, y, z) = (−ix + 3y, (2 − 3i)x + (2 + 5i)z, (3 + i)y − iz)
13.75. Let u = c₁e₁ + ⋯ + cₙeₙ, where {eᵢ} is an orthonormal basis and cᵢ is the complex conjugate of φ(eᵢ).
(i) u = (1, 2, −3), (ii) u = (−i, 2 − 3i, 1 + 2i), (iii) u = 30t² − 24t + 3
13.79. (i) ( 1/√5 2/√5 ; 2/√5 −1/√5 ); (ii) ( 1/√3 1/√3 1/√3 ; 0 1/√2 −1/√2 ; −2/√6 1/√6 1/√6 )
13.80. ( 1/3 2/3 2/3 ; 2/3 1/3 −2/3 ; 2/3 −2/3 1/3 )
13.81. (i) ( 1/√3 (1 − i)/√3 ; (1 + i)/√3 −1/√3 ); (ii) …
13.96. Only (i) and (v) are positive. Moreover, (v) is positive definite.
13.103. (i) P = ( 2/√5 1/√5 ; −1/√5 2/√5 ), (ii) P = ( 2/√5 1/√5 ; −1/√5 2/√5 ), (iii) P = ( 3/√10 −1/√10 ; 1/√10 3/√10 )
13.104. (i) x = (3x′ − y′)/√10, y = (x′ + 3y′)/√10, (ii) x = (2x′ − y′)/√5, y = (x′ + 2y′)/√5
13.105. x = x′/√3 + y′/√2 + z′/√6, y = x′/√3 − y′/√2 + z′/√6, z = x′/√3 − 2z′/√6
13.106. P = ( 1/√2 1/√2 ; 1/√2 −1/√2 ), P*AP = ( 2+i 0 ; 0 2−i )
Appendix A
Sets and Relations
SETS, ELEMENTS
Any well defined list or collection of objects is called a set; the objects comprising the
set are called its elements or members. We write
p G A if p is an element in the set A
If every element of A also belongs to a set B, i.e. if x ∈ A implies x ∈ B, then A is called
a subset of B or is said to be contained in B; this is denoted by
A ⊂ B or B ⊃ A
Two sets are equal if they both contain the same elements; that is,
A = B if and only if A ⊂ B and B ⊂ A
The negations of p ∈ A, A ⊂ B and A = B are written p ∉ A, A ⊄ B and A ≠ B
respectively.
We specify a particular set by either listing its elements or by stating properties which
characterize the elements in the set. For example,
A = {1, 3, 5, 7, 9}
means A is the set consisting of the numbers 1, 3, 5, 7 and 9; and
B = {x : x is a prime number, x < 15}
means that B is the set of prime numbers less than 15. We also use special symbols to
denote sets which occur very often in the text. Unless otherwise specified:
N = the set of positive integers: 1, 2, 3, ... ;
Z = the set of integers: ..., −2, −1, 0, 1, 2, ...;
Q = the set of rational numbers;
R = the set of real numbers;
C = the set of complex numbers.
We also use ∅ to denote the empty or null set, i.e. the set which contains no elements; this
set is assumed to be a subset of every other set.
Frequently the members of a set are sets themselves. For example, each line in a set
of lines is a set of points. To help clarify these situations, we use the words class, collection
and family synonymously with set. The words subclass, subcollection and subfamily have
meanings analogous to subset.
Example A.1: The sets A and B above can also be written as
A = {x ∈ N : x is odd, x < 10} and B = {2, 3, 5, 7, 11, 13}
Observe that 9 ∈ A but 9 ∉ B, and 11 ∈ B but 11 ∉ A; whereas 3 ∈ A and
3 ∈ B, and 6 ∉ A and 6 ∉ B.
Example A.2: The sets of numbers are related as follows: N ⊂ Z ⊂ Q ⊂ R ⊂ C.
Example A.3: Let C = {x : x² = 4, x is odd}. Then C = ∅, that is, C is the empty set.
Example A.4: The members of the class {{2, 3}, {2}, {5, 6}} are the sets {2, 3}, {2} and {5, 6}.
316 SETS AND RELATIONS [APPENDIX A
The following theorem applies.
Theorem A.1: Let A, B and C be any sets. Then: (i) A ⊂ A; (ii) if A ⊂ B and B ⊂ A,
then A = B; and (iii) if A ⊂ B and B ⊂ C, then A ⊂ C.
We emphasize that A ⊂ B does not exclude the possibility that A = B. However, if
A ⊂ B but A ≠ B, then we say that A is a proper subset of B. (Some authors use the
symbol ⊆ for a subset and the symbol ⊂ only for a proper subset.)
When we speak of an indexed set {aᵢ : i ∈ I}, or simply {aᵢ}, we mean that there is a
mapping φ from the set I to a set A and that the image φ(i) of i ∈ I is denoted aᵢ. The set
I is called the indexing set and the elements aᵢ (the range of φ) are said to be indexed by I.
A set {a₁, a₂, ...} indexed by the positive integers N is called a sequence. An indexed class
of sets {Aᵢ : i ∈ I}, or simply {Aᵢ}, has an analogous meaning except that now the map φ
assigns to each i ∈ I a set Aᵢ rather than an element aᵢ.
SET OPERATIONS
Let A and B be arbitrary sets. The union of A and B, written A ∪ B, is the set of
elements belonging to A or to B; and the intersection of A and B, written A ∩ B, is the set
of elements belonging to both A and B:
A ∪ B = {x : x ∈ A or x ∈ B} and A ∩ B = {x : x ∈ A and x ∈ B}
If A ∩ B = ∅, that is, if A and B do not have any elements in common, then A and B are
said to be disjoint.
We assume that all our sets are subsets of a fixed universal set (denoted here by U).
Then the complement of A, written Aᶜ, is the set of elements which do not belong to A:
Aᶜ = {x ∈ U : x ∉ A}
Example A.5: The following diagrams, called Venn diagrams, illustrate the above set operations.
Here sets are represented by simple plane areas and U, the universal set, by the
area in the entire rectangle.
[Venn diagrams: A ∪ B shaded; A ∩ B shaded; Aᶜ shaded]
Sets under the above operations satisfy various laws or identities which are listed in
the table below. In fact, we state
Theorem A.2: Sets satisfy the laws in Table 1.
LAWS OF THE ALGEBRA OF SETS

Idempotent Laws:    1a. A ∪ A = A                         1b. A ∩ A = A
Associative Laws:   2a. (A ∪ B) ∪ C = A ∪ (B ∪ C)         2b. (A ∩ B) ∩ C = A ∩ (B ∩ C)
Commutative Laws:   3a. A ∪ B = B ∪ A                     3b. A ∩ B = B ∩ A
Distributive Laws:  4a. A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)   4b. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
Identity Laws:      5a. A ∪ ∅ = A                         5b. A ∩ U = A
                    6a. A ∪ U = U                         6b. A ∩ ∅ = ∅
Complement Laws:    7a. A ∪ Aᶜ = U                        7b. A ∩ Aᶜ = ∅
                    8a. (Aᶜ)ᶜ = A                         8b. Uᶜ = ∅, ∅ᶜ = U
De Morgan's Laws:   9a. (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ                9b. (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ

Table 1
Remark: Each of the above laws follows from an analogous logical law. For example,
A ∩ B = {x : x ∈ A and x ∈ B} = {x : x ∈ B and x ∈ A} = B ∩ A
(Here we use the fact that the composite statement "p and q", written p ∧ q, is
logically equivalent to the composite statement "q and p", i.e. q ∧ p.)
The relationship between set inclusion and the above set operations follows.
Theorem A.3: Each of the following conditions is equivalent to A ⊂ B:
(i) A ∩ B = A (iii) Bᶜ ⊂ Aᶜ (v) B ∪ Aᶜ = U
(ii) A ∪ B = B (iv) A ∩ Bᶜ = ∅
We generalize the above set operations as follows. Let {Aᵢ : i ∈ I} be any family of
sets. Then the union of the Aᵢ, written ∪ᵢ∈I Aᵢ (or simply ∪ᵢ Aᵢ), is the set of elements
each belonging to at least one of the Aᵢ; and the intersection of the Aᵢ, written ∩ᵢ∈I Aᵢ (or
simply ∩ᵢ Aᵢ), is the set of elements each belonging to every Aᵢ.
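Python's built-in set type makes the laws of Table 1 easy to spot-check on small examples (a sanity check on particular sets, not a proof):

```python
A = {1, 3, 5, 7, 9}
B = {2, 3, 5, 7, 11, 13}
C = {1, 2, 3}
U = set(range(15))                 # a universal set containing A, B and C

def comp(X):
    """Complement X^c relative to the universal set U."""
    return U - X

assert (A | B) == (B | A)                          # 3a, commutative law
assert A | (B & C) == (A | B) & (A | C)            # 4a, distributive law
assert comp(A | B) == comp(A) & comp(B)            # 9a, De Morgan's law
assert comp(A & B) == comp(A) | comp(B)            # 9b, De Morgan's law
assert A | comp(A) == U and A & comp(A) == set()   # 7a, 7b, complement laws
```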
PRODUCT SETS
Let A and B be two sets. The product set of A and B, denoted by A × B, consists of all
ordered pairs (a, b) where a ∈ A and b ∈ B:
A × B = {(a, b) : a ∈ A, b ∈ B}
The product of a set with itself, say A × A, is denoted by A².
Example A.6: The reader is familiar with the cartesian plane R² = R × R. Here
each point P represents an ordered pair (a, b) of real numbers, and vice versa.
Example A.7: Let A = {1, 2, 3} and B = {a, b}. Then
A × B = {(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)}
Remark: The ordered pair (a, b) is defined rigorously by (a, b) = {{a}, {a, b}}. From this
definition, the "order" property may be proven; that is, (a, b) = (c, d) if and only
if a = c and b = d.
The concept of product set is extended to any finite number of sets in a natural way.
The product set of the sets A₁, ..., Aₘ, written A₁ × A₂ × ⋯ × Aₘ, is the set consisting of
all m-tuples (a₁, a₂, ..., aₘ) where aᵢ ∈ Aᵢ for each i.
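Python's itertools.product enumerates exactly these m-tuples, so the product sets above can be built directly:

```python
from itertools import product

A = {1, 2, 3}
B = {'a', 'b'}

AxB = set(product(A, B))            # A x B: all ordered pairs (a, b)
assert len(AxB) == len(A) * len(B)  # 3 * 2 = 6 pairs
assert (1, 'a') in AxB and ('a', 1) not in AxB   # the pairs are ordered

# the m-fold product A1 x ... x Am works the same way:
cube = set(product({0, 1}, repeat=3))   # {0, 1}^3
assert len(cube) == 8                   # 2^3 triples
```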
RELATIONS
A binary relation or simply relation R from a set A to a set B assigns to each ordered
pair (a, b) ∈ A × B exactly one of the following statements:
(i) "a is related to b", written a R b,
(ii) "a is not related to b", written a R̸ b.
A relation from a set A to the same set A is called a relation in A.
Example A.8: Set inclusion is a relation in any class of sets. For, given any pair of sets A and B,
either A ⊂ B or A ⊄ B.
Observe that any relation R from A to B uniquely defines a subset R of A × B as follows:
R = {(a, b) : a R b}
Conversely, any subset R of A × B defines a relation from A to B as follows:
a R b if and only if (a, b) ∈ R
In view of the above correspondence between relations from A to B and subsets of A × B,
we redefine a relation as follows:
Definition: A relation R from A to B is a subset of A × B.
EQUIVALENCE RELATIONS
A relation in a set A is called an equivalence relation if it satisfies the following axioms:
[E₁] Every a ∈ A is related to itself.
[E₂] If a is related to b, then b is related to a.
[E₃] If a is related to b and b is related to c, then a is related to c.
In general, a relation is said to be reflexive if it satisfies [E₁], symmetric if it satisfies [E₂],
and transitive if it satisfies [E₃]. In other words, a relation is an equivalence relation if
it is reflexive, symmetric and transitive.
Example A.9: Consider the relation ⊂ of set inclusion. By Theorem A.1, A ⊂ A for every set A;
and if A ⊂ B and B ⊂ C, then A ⊂ C. That is, ⊂ is both reflexive and transitive.
On the other hand, ⊂ is not symmetric, since A ⊂ B and A ≠ B implies B ⊄ A.
Example A.10: In Euclidean geometry, similarity of triangles is an equivalence relation. For if
α, β and γ are any triangles, then: (i) α is similar to itself; (ii) if α is similar to β,
then β is similar to α; and (iii) if α is similar to β and β is similar to γ, then α is
similar to γ.
If R is an equivalence relation in A, then the equivalence class of any element a ∈ A,
denoted by [a], is the set of elements to which a is related:
[a] = {x : a R x}
The collection of equivalence classes, denoted by A/R, is called the quotient of A by R:
A/R = {[a] : a ∈ A}
The fundamental property of equivalence relations follows:
Theorem A.4: Let R be an equivalence relation in A. Then the quotient set A/R is a
partition of A, i.e. each a ∈ A belongs to a member of A/R, and the members of A/R are pairwise disjoint.
Example A.11: Let R₅ be the relation in Z, the set of integers, defined by
x ≡ y (mod 5)
which reads "x is congruent to y modulo 5" and which means "x − y is divisible by 5".
Then R₅ is an equivalence relation in Z. There are exactly five distinct equivalence
classes in Z/R₅:
A₀ = {..., −10, −5, 0, 5, 10, ...}
A₁ = {..., −9, −4, 1, 6, 11, ...}
A₂ = {..., −8, −3, 2, 7, 12, ...}
A₃ = {..., −7, −2, 3, 8, 13, ...}
A₄ = {..., −6, −1, 4, 9, 14, ...}
Now each integer x is uniquely expressible in the form x = 5q + r where 0 ≤ r < 5;
observe that x ∈ Aᵣ where r is the remainder. Note that the equivalence classes
are pairwise disjoint and that Z = A₀ ∪ A₁ ∪ A₂ ∪ A₃ ∪ A₄.
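The five classes of Example A.11 can be generated mechanically by grouping integers by remainder mod 5, and the partition property of Theorem A.4 checked on a finite range (a finite illustration, not a proof):

```python
# integers -10..14 grouped into the classes of x ≡ y (mod 5)
ints = list(range(-10, 15))
classes = {r: [x for x in ints if x % 5 == r] for r in range(5)}

assert classes[0] == [-10, -5, 0, 5, 10]
assert classes[1] == [-9, -4, 1, 6, 11]

# Theorem A.4: the classes cover the range and are pairwise disjoint
assert sorted(x for cls in classes.values() for x in cls) == ints
for cls in classes.values():
    assert all((x - y) % 5 == 0 for x in cls for y in cls)   # all pairs related
```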
Appendix B
Algebraic Structures
INTRODUCTION
We define here algebraic structures which occur in almost all branches of mathematics.
In particular we will define a field which appears in the definition of a vector space. We
begin with the definition of a group, which is a relatively simple algebraic structure with
only one operation and is used as a building block for many other algebraic systems.
GROUPS
Let G be a nonempty set with a binary operation, i.e. to each pair of elements a, b ∈ G
there is assigned an element ab ∈ G. Then G is called a group if the following axioms hold:
[G₁] For any a, b, c ∈ G, we have (ab)c = a(bc) (the associative law).
[G₂] There exists an element e ∈ G, called the identity element, such that ae = ea = a for
every a ∈ G.
[G₃] For each a ∈ G there exists an element a⁻¹ ∈ G, called the inverse of a, such that
aa⁻¹ = a⁻¹a = e.
A group G is said to be abelian (or: commutative) if the commutative law holds, i.e. if
ab = ba for every a, b ∈ G.
When the binary operation is denoted by juxtaposition as above, the group G is said
to be written multiplicatively. Sometimes, when G is abelian, the binary operation is
denoted by + and G is said to be written additively. In such case the identity element is
denoted by 0 and is called the zero element; and the inverse is denoted by −a and is called
the negative of a.
If A and B are subsets of a group G then we write
AB = {ab : a ∈ A, b ∈ B}  or  A + B = {a + b : a ∈ A, b ∈ B}
We also write a for {a}.
A subset H of a group G is called a subgroup of G if H itself forms a group under the
operation of G. If H is a subgroup of G and a ∈ G, then the set Ha is called a right coset
of H and the set aH is called a left coset of H.
Definition: A subgroup H of G is called a normal subgroup if a⁻¹Ha ⊂ H for every a ∈ G.
Equivalently, H is normal if aH = Ha for every a ∈ G, i.e. if the right and
left cosets of H coincide.
Note that every subgroup of an abelian group is normal.
Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group
under coset multiplication. This group is called the quotient group and is
denoted by G/H.
Example B.1: The set Z of integers forms an abelian group under addition. (We remark that the
even integers form a subgroup of Z but the odd integers do not.) Let H denote the
set of multiples of 5, i.e. H = {..., −10, −5, 0, 5, 10, ...}. Then H is a subgroup
(necessarily normal) of Z. The cosets of H in Z follow:
0̄ = 0 + H = H = {..., −10, −5, 0, 5, 10, ...}
1̄ = 1 + H = {..., −9, −4, 1, 6, 11, ...}
2̄ = 2 + H = {..., −8, −3, 2, 7, 12, ...}
3̄ = 3 + H = {..., −7, −2, 3, 8, 13, ...}
4̄ = 4 + H = {..., −6, −1, 4, 9, 14, ...}
For any other integer n ∈ Z, n̄ = n + H coincides with one of the above cosets.
Thus by the above theorem, Z/H = {0̄, 1̄, 2̄, 3̄, 4̄} forms a group under coset addition;
its addition table follows:
 +  | 0̄   1̄   2̄   3̄   4̄
----+--------------------
 0̄  | 0̄   1̄   2̄   3̄   4̄
 1̄  | 1̄   2̄   3̄   4̄   0̄
 2̄  | 2̄   3̄   4̄   0̄   1̄
 3̄  | 3̄   4̄   0̄   1̄   2̄
 4̄  | 4̄   0̄   1̄   2̄   3̄
This quotient group Z/H is referred to as the integers modulo 5 and is frequently
denoted by Z₅. Analogously, for any positive integer n, there exists the quotient
group Zₙ, called the integers modulo n.
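The coset arithmetic above is easy to tabulate by machine. A minimal sketch (the helper names are ours, not the book's), representing each coset a + H by its remainder 0 ≤ a < 5:

```python
n = 5
def coset_add(a, b):
    """(a + H) + (b + H) = (a + b) + H, representatives 0 <= a, b < n."""
    return (a + b) % n

# Build the addition table of Z5 and spot-check the group axioms.
table = [[coset_add(a, b) for b in range(n)] for a in range(n)]

assert table[2][3] == 0                           # (2 + H) + (3 + H) = 0 + H
assert all(table[a][0] == a for a in range(n))    # 0 + H is the identity
assert all(0 in row for row in table)             # every coset has a negative
```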
Example B.2: The permutations of n symbols (see page 171) form a group under composition of
mappings; it is called the symmetric group of degree n and is denoted by Sₙ. We
investigate S₃ here; its elements are
e = (1 2 3 / 1 2 3), σ₁ = (1 2 3 / 1 3 2), σ₂ = (1 2 3 / 3 2 1),
σ₃ = (1 2 3 / 2 1 3), φ₁ = (1 2 3 / 2 3 1), φ₂ = (1 2 3 / 3 1 2)
Here (1 2 3 / i j k) denotes the permutation which maps 1 ↦ i, 2 ↦ j, 3 ↦ k. The
multiplication table of S₃ is
    | e   σ₁  σ₂  σ₃  φ₁  φ₂
----+------------------------
 e  | e   σ₁  σ₂  σ₃  φ₁  φ₂
 σ₁ | σ₁  e   φ₁  φ₂  σ₂  σ₃
 σ₂ | σ₂  φ₂  e   φ₁  σ₃  σ₁
 σ₃ | σ₃  φ₁  φ₂  e   σ₁  σ₂
 φ₁ | φ₁  σ₃  σ₁  σ₂  φ₂  e
 φ₂ | φ₂  σ₂  σ₃  σ₁  e   φ₁
(The element in the ath row and bth column is ab.) The set H = {e, σ₁} is a subgroup
of S₃; its right and left cosets are
Right Cosets              Left Cosets
H = {e, σ₁}               H = {e, σ₁}
Hφ₁ = {φ₁, σ₂}            φ₁H = {φ₁, σ₃}
Hφ₂ = {φ₂, σ₃}            φ₂H = {φ₂, σ₂}
Observe that the right cosets and the left cosets are distinct; hence H is not a normal
subgroup of S₃.
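The coset computation for S₃ can be replayed in code. In this sketch (labels and the composition convention are our own choices, made only to exhibit the same phenomenon), permutations act on {0, 1, 2} and H is generated by one transposition:

```python
from itertools import permutations

def compose(p, q):
    """Composition on {0, 1, 2}: first apply q, then p."""
    return tuple(p[q[i]] for i in range(3))

S3 = set(permutations(range(3)))
e, s1 = (0, 1, 2), (0, 2, 1)      # identity and one transposition
H = {e, s1}

right_cosets = {frozenset(compose(h, a) for h in H) for a in S3}
left_cosets  = {frozenset(compose(a, h) for h in H) for a in S3}

assert len(right_cosets) == len(left_cosets) == 3   # |S3| / |H| cosets each
assert right_cosets != left_cosets                  # so H is not normal
```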
A mapping f from a group G into a group G′ is called a homomorphism if f(ab) =
f(a)f(b) for every a, b ∈ G. (If f is also bijective, i.e. one-to-one and onto, then f is called
an isomorphism and G and G′ are said to be isomorphic.) If f : G → G′ is a homomorphism,
then the kernel of f is the set of elements of G which map into the identity element e′ ∈ G′:
kernel of f = {a ∈ G : f(a) = e′}
(As usual, f(G) is called the image of the mapping f : G → G′.) The following theorem
applies.
Theorem B.2: Let f : G → G′ be a homomorphism with kernel K. Then K is a normal
subgroup of G, and the quotient group G/K is isomorphic to the image of f.
Example B.3: Let G be the group of real numbers under addition, and let G′ be the group of
positive real numbers under multiplication. The mapping f : G → G′ defined by
f(a) = 2^a is a homomorphism because
f(a + b) = 2^(a+b) = 2^a 2^b = f(a) f(b)
In particular, f is bijective; hence G and G′ are isomorphic.
Example B.4: Let G be the group of nonzero complex numbers under multiplication, and let G′
be the group of nonzero real numbers under multiplication. The mapping f : G → G′
defined by f(z) = |z| is a homomorphism because
f(z₁z₂) = |z₁z₂| = |z₁| |z₂| = f(z₁) f(z₂)
The kernel K of f consists of those complex numbers z on the unit circle, i.e. for
which |z| = 1. Thus G/K is isomorphic to the image of f, i.e. to the group of positive
real numbers under multiplication.
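Example B.4 can be sanity-checked numerically. This sketch is our own (it uses floating-point arithmetic, hence the tolerances); it verifies the homomorphism property on sample points and that a unit-circle element maps to the identity of G′:

```python
import cmath

def f(z):
    """The absolute-value map |z| from nonzero C to nonzero R."""
    return abs(z)

z1, z2 = 3 + 4j, 1 - 2j
# f(z1 z2) = f(z1) f(z2), up to floating-point rounding
assert abs(f(z1 * z2) - f(z1) * f(z2)) < 1e-9

# A point on the unit circle lies in the kernel: it maps to 1.
w = cmath.exp(1j * 0.7)
assert abs(f(w) - 1.0) < 1e-9
```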
RINGS, INTEGRAL DOMAINS AND FIELDS
Let R be a nonempty set with two binary operations, an operation of addition (denoted
by +) and an operation of multiplication (denoted by juxtaposition). Then R is called a
ring if the following axioms are satisfied:
[R₁] For any a, b, c ∈ R, we have (a + b) + c = a + (b + c).
[R₂] There exists an element 0 ∈ R, called the zero element, such that a + 0 = 0 + a = a
for every a ∈ R.
[R₃] For each a ∈ R there exists an element −a ∈ R, called the negative of a, such that
a + (−a) = (−a) + a = 0.
[R₄] For any a, b ∈ R, we have a + b = b + a.
[R₅] For any a, b, c ∈ R, we have (ab)c = a(bc).
[R₆] For any a, b, c ∈ R, we have:
(i) a(b + c) = ab + ac, and (ii) (b + c)a = ba + ca.
Observe that the axioms [R₁] through [R₄] may be summarized by saying that R is an
abelian group under addition.
Subtraction is defined in R by a − b = a + (−b).
It can be shown (see Problem B.25) that a · 0 = 0 · a = 0 for every a ∈ R.
R is called a commutative ring if ab = ba for every a, b ∈ R. We also say that R is
a ring with a unit element if there exists a nonzero element 1 ∈ R such that a · 1 = 1 · a = a
for every a ∈ R.
A nonempty subset S of R is called a subring of R if S itself forms a ring under the
operations of R. We note that S is a subring of R if and only if a, b ∈ S implies a − b ∈ S
and ab ∈ S.
A nonempty subset I of R is called a left ideal in R if: (i) a − b ∈ I whenever a, b ∈ I,
and (ii) ra ∈ I whenever r ∈ R, a ∈ I. Note that a left ideal I in R is also a subring of R.
Similarly we can define a right ideal and a two-sided ideal. Clearly all ideals in com-
mutative rings are two-sided. The term ideal shall mean two-sided ideal unless otherwise
specified.
Theorem B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets {a + I : a ∈ R}
form a ring under coset addition and coset multiplication. This ring is
denoted by R/I and is called the quotient ring.
Now let R be a commutative ring with a unit element. For any a ∈ R, the set
(a) = {ra : r ∈ R} is an ideal; it is called the principal ideal generated by a. If every ideal
in R is a principal ideal, then R is called a principal ideal ring.
Definition: A commutative ring R with a unit element is called an integral domain if R
has no zero divisors, i.e. if ab = 0 implies a = 0 or b = 0.
Definition: A commutative ring R with a unit element is called a field if every nonzero
a ∈ R has a multiplicative inverse, i.e. there exists an element a⁻¹ ∈ R such
that aa⁻¹ = a⁻¹a = 1.
A field is necessarily an integral domain; for if ab = 0 and a ≠ 0, then
b = 1·b = a⁻¹ab = a⁻¹·0 = 0
We remark that a field may also be viewed as a commutative ring in which the nonzero
elements form a group under multiplication.
Example B.5: The set Z of integers with the usual operations of addition and multiplication is the
classical example of an integral domain with a unit element. Every ideal I in Z is
a principal ideal, i.e. I = (n) for some integer n. The quotient ring Zₙ = Z/(n)
is called the ring of integers modulo n. If n is prime, then Zₙ is a field. On the
other hand, if n is not prime then Zₙ has zero divisors. For example, in the ring Z₆,
2 · 3 = 0 but 2 ≠ 0 and 3 ≠ 0.
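Example B.5's claims can be brute-force checked for small n (the helper names below are ours, not the book's):

```python
def has_zero_divisors(n):
    """True if some nonzero a, b in Z_n satisfy ab = 0."""
    return any(a * b % n == 0
               for a in range(1, n) for b in range(1, n))

def every_nonzero_invertible(n):
    """True if every nonzero a in Z_n has a multiplicative inverse."""
    return all(any(a * b % n == 1 for b in range(1, n))
               for a in range(1, n))

assert has_zero_divisors(6)             # e.g. 2 * 3 = 0 in Z_6
assert every_nonzero_invertible(5)      # Z_5 is a field
assert not every_nonzero_invertible(6)  # Z_6 is not
```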
Example B.6: The rational numbers Q and the real numbers R each form a field with respect
to the usual operations of addition and multiplication.
Example B.7: Let C denote the set of ordered pairs of real numbers with addition and multiplica-
tion defined by
(a, b) + (c, d) = (a + c, b + d)
(a, b) · (c, d) = (ac − bd, ad + bc)
Then C satisfies all the required properties of a field. In fact, C is just the field of
complex numbers (see page 4).
Example B.8: The set M of all 2 by 2 matrices with real entries forms a noncommutative ring with
zero divisors under the operations of matrix addition and matrix multiplication.
Example B.9: Let R be any ring. Then the set R[x] of all polynomials over R forms a ring with
respect to the usual operations of addition and multiplication of polynomials.
Moreover, if R is an integral domain then R[x] is also an integral domain.
Now let D be an integral domain. We say that b divides a in D if a = bc for some
c ∈ D. An element u ∈ D is called a unit if u divides 1, i.e. if u has a multiplicative inverse.
An element b ∈ D is called an associate of a ∈ D if b = ua for some unit u ∈ D. A
nonunit p ∈ D is said to be irreducible if p = ab implies a or b is a unit.
An integral domain D is called a unique factorization domain if every nonunit a ∈ D
can be written uniquely (up to associates and order) as a product of irreducible elements.
Example B.10: The ring Z of integers is the classical example of a unique factorization domain.
The units of Z are 1 and −1. The only associates of n ∈ Z are n and −n. The
irreducible elements of Z are the prime numbers.
Example B.11: The set D = {a + b√13 : a, b integers} is an integral domain. The units of D
are ±1, 18 ± 5√13 and −18 ± 5√13. The elements 2, 3 − √13 and −3 − √13 are
irreducible in D. Observe that 4 = 2 · 2 = (3 − √13)(−3 − √13). Thus D is not
a unique factorization domain. (See Problem B.40.)
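The arithmetic behind Example B.11 can be verified in exact integer arithmetic, representing a + b√13 by the pair (a, b) (the representation and function names are ours):

```python
def mul(x, y):
    """(a + b√13)(c + d√13) = (ac + 13bd) + (ad + bc)√13."""
    a, b = x
    c, d = y
    return (a * c + 13 * b * d, a * d + b * c)

def norm(x):
    """N(a + b√13) = a^2 - 13 b^2, as in Problem B.40."""
    a, b = x
    return a * a - 13 * b * b

u = (18, 5)                                # 18 + 5√13
assert norm(u) == -1                       # norm ±1, so u is a unit
assert mul((3, -1), (-3, -1)) == (4, 0)    # (3 − √13)(−3 − √13) = 4
assert mul((2, 0), (2, 0)) == (4, 0)       # the second factorization of 4
```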
MODULES
Let M be a nonempty set and let R be a ring with a unit element. Then M is said to be a
(left) R-module if M is an additive abelian group and there exists a mapping R × M → M
which satisfies the following axioms:
[M₁] r(m₁ + m₂) = rm₁ + rm₂
[M₂] (r + s)m = rm + sm
[M₃] (rs)m = r(sm)
[M₄] 1·m = m
for any r, s ∈ R and any m, m₁, m₂ ∈ M.
We emphasize that an R-module is a generalization of a vector space, where we allow
the scalars to come from a ring rather than a field.
Example B.12: Let G be any additive abelian group. We make G into a module over the ring Z of
integers by defining
ng = g + g + ··· + g (n times),  0g = 0,  (−n)g = −(ng)
where n is any positive integer.
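Example B.12's Z-action can be written out directly; this sketch is ours, applying the definition to the additive group Z₇ supplied as a triple of operations:

```python
def z_action(n, g, add, zero, neg):
    """n·g = g + g + ... + g (n times), with 0·g = 0 and (-n)·g = -(n·g)."""
    if n < 0:
        return neg(z_action(-n, g, add, zero, neg))
    result = zero
    for _ in range(n):
        result = add(result, g)
    return result

add7 = lambda a, b: (a + b) % 7       # addition in Z_7
neg7 = lambda a: (-a) % 7             # negation in Z_7

assert z_action(3, 5, add7, 0, neg7) == 15 % 7      # 3·5 = 1 in Z_7
assert z_action(-2, 5, add7, 0, neg7) == (-10) % 7  # (-2)·5 = 4 in Z_7
```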
Example B.13: Let R be a ring and let I be an ideal in R. Then I may be viewed as a module over R.
Example B.14: Let V be a vector space over a field K and let T : V → V be a linear mapping.
We make V into a module over the ring K[x] of polynomials over K by defining
f(x)v = f(T)(v). The reader should check that a scalar multiplication has been
defined.
Let M be a module over R. An additive subgroup N of M is called a submodule of M
if u ∈ N and k ∈ R imply ku ∈ N. (Note that N is then a module over R.)
Let M and M′ be R-modules. A mapping T : M → M′ is called a homomorphism (or:
R-homomorphism or R-linear) if
(i) T(u + v) = T(u) + T(v) and (ii) T(ku) = kT(u)
for every u, v ∈ M and every k ∈ R.
Problems
GROUPS
B.1. Determine whether each of the following systems forms a group G:
(i) G = set of integers, operation subtraction;
(ii) G = {1, −1}, operation multiplication;
(iii) G = set of nonzero rational numbers, operation division;
(iv) G = set of nonsingular n × n matrices, operation matrix multiplication;
(v) G = {a + bi : a, b ∈ Z}, operation addition.
B.2. Show that in a group G:
(i) the identity element of G is unique;
(ii) each a ∈ G has a unique inverse a⁻¹ ∈ G;
(iii) (a⁻¹)⁻¹ = a, and (ab)⁻¹ = b⁻¹a⁻¹;
(iv) ab = ac implies b = c, and ba = ca implies b = c.
B.3. In a group G, the powers of a ∈ G are defined by
a⁰ = e, aⁿ = a aⁿ⁻¹, a⁻ⁿ = (aⁿ)⁻¹, where n ∈ N
Show that the following formulas hold for any integers r, s, t ∈ Z: (i) aʳaˢ = aʳ⁺ˢ, (ii) (aʳ)ˢ = aʳˢ,
(iii) (aʳ⁺ˢ)ᵗ = aʳᵗ⁺ˢᵗ.
B.4. Show that if G is an abelian group, then (ab)ⁿ = aⁿbⁿ for any a, b ∈ G and any integer n ∈ Z.
B.5. Suppose G is a group such that (ab)² = a²b² for every a, b ∈ G. Show that G is abelian.
B.6. Suppose H is a subset of a group G. Show that H is a subgroup of G if and only if (i) H is non-
empty, and (ii) a, b ∈ H implies ab⁻¹ ∈ H.
B.7. Prove that the intersection of any number of subgroups of G is also a subgroup of G.
B.8. Show that the set of all powers of a ∈ G is a subgroup of G; it is called the cyclic group generated
by a.
B.9. A group G is said to be cyclic if G is generated by some a ∈ G, i.e. G = {aⁿ : n ∈ Z}. Show that
every subgroup of a cyclic group is cyclic.
B.10. Suppose G is a cyclic group. Show that G is isomorphic either to the set Z of integers under addition
or to the set Zₙ (of the integers modulo n) under addition.
B.ll. Let H be a subgroup of G. Show that the right (left) cosets of H partition G into mutually disjoint
subsets.
B.12. The order of a group G, denoted by |G|, is the number of elements of G. Prove Lagrange's theorem:
If H is a subgroup of a finite group G, then |H| divides |G|.
B.13. Suppose |G| = p where p is prime. Show that G is cyclic.
B.14. Suppose H and N are subgroups of G with N normal. Show that (i) HN is a subgroup of G and
(ii) H ∩ N is a normal subgroup of H.
B.15. Let H be a subgroup of G with only two right (left) cosets. Show that H is a normal subgroup of G.
B.16. Prove Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group
G/H under coset multiplication.
B.17. Suppose G is an abelian group. Show that any factor group G/H is also abelian.
B.18. Let f : G → G′ be a group homomorphism. Show that:
(i) f(e) = e′ where e and e′ are the identity elements of G and G′ respectively;
(ii) f(a⁻¹) = f(a)⁻¹ for any a ∈ G.
B.19. Prove Theorem B.2: Let f : G → G′ be a group homomorphism with kernel K. Then K is a normal
subgroup of G, and the quotient group G/K is isomorphic to the image of f.
B.20. Let G be the multiplicative group of complex numbers z such that |z| = 1, and let R be the additive
group of real numbers. Prove that G is isomorphic to R/Z.
B.21. For a fixed g ∈ G, let ĝ : G → G be defined by ĝ(a) = g⁻¹ag. Show that ĝ is an isomorphism of
G onto G.
B.22. Let G be the multiplicative group of n × n nonsingular matrices over R. Show that the mapping
A ↦ |A| is a homomorphism of G into the multiplicative group of nonzero real numbers.
B.23. Let G be an abelian group. For a fixed n ∈ Z, show that the map a ↦ aⁿ is a homomorphism
of G into G.
B.24. Suppose H and N are subgroups of G with N normal. Prove that H ∩ N is normal in H and
H/(H ∩ N) is isomorphic to HN/N.
RINGS
B.25. Show that in a ring R:
(i) a · 0 = 0 · a = 0, (ii) a(−b) = (−a)b = −ab, (iii) (−a)(−b) = ab.
B.26. Show that in a ring R with a unit element: (i) (−1)a = −a, (ii) (−1)(−1) = 1.
B.27. Suppose a² = a for every a ∈ R. Prove that R is a commutative ring. (Such a ring is called a
Boolean ring.)
B.28. Let R be a ring with a unit element. We make R into another ring R̂ by defining a ⊕ b = a + b + 1
and a ⊙ b = ab + a + b. (i) Verify that R̂ is a ring. (ii) Determine the 0-element and 1-element
of R̂.
B.29. Let G be any (additive) abelian group. Define a multiplication in G by ab = 0. Show that this
makes G into a ring.
B.30. Prove Theorem B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets {a + I : a ∈ R} form
a ring under coset addition and coset multiplication.
B.31. Let I₁ and I₂ be ideals in R. Prove that I₁ + I₂ and I₁ ∩ I₂ are also ideals in R.
B.32. Let R and R′ be rings. A mapping f : R → R′ is called a homomorphism (or: ring homomorphism) if
(i) f(a + b) = f(a) + f(b) and (ii) f(ab) = f(a)f(b),
for every a, b ∈ R. Prove that if f : R → R′ is a homomorphism, then the set K = {r ∈ R : f(r) = 0}
is an ideal in R. (The set K is called the kernel of f.)
INTEGRAL DOMAINS AND FIELDS
B.33. Prove that in an integral domain D, if ab = ac, a ≠ 0, then b = c.
B.34. Prove that F = {a + b√2 : a, b rational} is a field.
B.35. Prove that D = {a + b√2 : a, b integers} is an integral domain but not a field.
B.36. Prove that a finite integral domain D is a field.
B.37. Show that the only ideals in a field K are {0} and K.
B.38. A complex number a + bi where a, b are integers is called a Gaussian integer. Show that the set G
of Gaussian integers is an integral domain. Also show that the units in G are ±1 and ±i.
B.39. Let D be an integral domain and let I be an ideal in D. Prove that the factor ring D/I is an integral
domain if and only if I is a prime ideal. (An ideal I is prime if ab ∈ I implies a ∈ I or b ∈ I.)
B.40. Consider the integral domain D = {a + b√13 : a, b integers} (see Example B.11). If α = a + b√13,
we define N(α) = a² − 13b². Prove: (i) N(αβ) = N(α)N(β); (ii) α is a unit if and only if N(α) = ±1;
(iii) the units of D are ±1, 18 ± 5√13 and −18 ± 5√13; (iv) the numbers 2, 3 − √13 and −3 − √13
are irreducible.
MODULES
B.41. Let M be an R-module and let A and B be submodules of M. Show that A + B and A ∩ B are also
submodules of M.
B.42. Let M be an R-module with submodule N. Show that the cosets {u + N : u ∈ M} form an R-module
under coset addition and scalar multiplication defined by r(u + N) = ru + N. (This module is de-
noted by M/N and is called the quotient module.)
B.43. Let M and M′ be R-modules and let f : M → M′ be an R-homomorphism. Show that the set
K = {u ∈ M : f(u) = 0} is a submodule of M. (The set K is called the kernel of f.)
B.44. Let M be an R-module and let E(M) denote the set of all R-homomorphisms of M into itself. Define
the appropriate operations of addition and multiplication in E(M) so that E(M) becomes a ring.
Appendix C
Polynomials over a Field
INTRODUCTION
We will investigate polynomials over a field K and show that they have many properties
which are analogous to properties of the integers. These results play an important role
in obtaining canonical forms for a linear operator T on a vector space V over K.
RING OF POLYNOMIALS
Let K be a field. Formally, a polynomial f over K is an infinite sequence of elements
of K in which all except a finite number of them are 0:
f = (..., 0, aₙ, ..., a₁, a₀)
(We write the sequence so that it extends to the left instead of to the right.) The entry aₖ
is called the kth coefficient of f. If n is the largest integer for which aₙ ≠ 0, then we say
that the degree of f is n, written
deg f = n
We also call aₙ the leading coefficient of f, and if aₙ = 1 we call f a monic polynomial. On
the other hand, if every coefficient of f is 0 then f is called the zero polynomial, written
f = 0. The degree of the zero polynomial is not defined.
Now if g is another polynomial over K, say
g = (..., 0, bₘ, ..., b₁, b₀)
then the sum f + g is the polynomial obtained by adding corresponding coefficients. That
is, if m ≤ n then
f + g = (..., 0, aₙ, ..., aₘ + bₘ, ..., a₁ + b₁, a₀ + b₀)
Furthermore, the product fg is the polynomial
fg = (..., 0, aₙbₘ, ..., a₁b₀ + a₀b₁, a₀b₀)
that is, the kth coefficient cₖ of fg is
cₖ = Σᵢ₌₀ᵏ aᵢbₖ₋ᵢ = a₀bₖ + a₁bₖ₋₁ + ··· + aₖb₀
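The coefficient formula for fg is a convolution, which the following sketch (ours; coefficient lists are written in increasing degree, the reverse of the display above) implements directly:

```python
def poly_mul(f, g):
    """Multiply coefficient lists [a0, a1, ..., an] and [b0, b1, ..., bm]:
    c_k = sum over i of a_i * b_{k-i}."""
    c = [0] * (len(f) + len(g) - 1)
    for i, ai in enumerate(f):
        for j, bj in enumerate(g):
            c[i + j] += ai * bj
    return c

# (1 + t)(1 - t) = 1 - t^2
assert poly_mul([1, 1], [1, -1]) == [1, 0, -1]
```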
The following theorem applies.
Theorem C.1: The set P of polynomials over a field K under the above operations of addi-
tion and multiplication forms a commutative ring with a unit element
and with no zero divisors, i.e. an integral domain. If f and g are nonzero
polynomials in P, then deg (fg) = deg f + deg g.
NOTATION
We identify the scalar a₀ ∈ K with the polynomial
a₀ = (..., 0, a₀)
We also choose a symbol, say t, to denote the polynomial
t = (..., 0, 1, 0)
We call the symbol t an indeterminate. Multiplying t with itself, we obtain
t² = (..., 0, 1, 0, 0), t³ = (..., 0, 1, 0, 0, 0), ...
Thus the above polynomial f can be written uniquely in the usual form
f = aₙtⁿ + ··· + a₁t + a₀
When the symbol t is selected as the indeterminate, the ring of polynomials over K is
denoted by
K[t]
and a polynomial f is frequently denoted by f(t).
We also view the field K as a subset of K[t] under the above identification. This is pos-
sible since the operations of addition and multiplication of elements of K are preserved
under this identification:
(..., 0, a₀) + (..., 0, b₀) = (..., 0, a₀ + b₀)
(..., 0, a₀) · (..., 0, b₀) = (..., 0, a₀b₀)
We remark that the nonzero elements of K are the units of the ring K[t].
We also remark that every nonzero polynomial is an associate of a unique monic poly-
nomial. Hence if d and d′ are monic polynomials for which d divides d′ and d′ divides d,
then d = d′. (A polynomial g divides a polynomial f if there is a polynomial h such that
f = hg.)
DIVISIBILITY
The following theorem formalizes the process known as "long division".
Theorem C.2 (Division Algorithm): Let f and g be polynomials over a field K with g ≠ 0.
Then there exist polynomials q and r such that
f = qg + r
where either r = 0 or deg r < deg g.
Proof: If f = 0 or if deg f < deg g, then we have the required representation
f = 0g + f
Now suppose deg f ≥ deg g, say
f = aₙtⁿ + ··· + a₁t + a₀ and g = bₘtᵐ + ··· + b₁t + b₀
where aₙ, bₘ ≠ 0 and n ≥ m. We form the polynomial
f₁ = f − (aₙ/bₘ) tⁿ⁻ᵐ g    (1)
Then deg f₁ < deg f. By induction, there exist polynomials q₁ and r such that
f₁ = q₁g + r
where either r = 0 or deg r < deg g. Substituting this into (1) and solving for f,
f = (q₁ + (aₙ/bₘ) tⁿ⁻ᵐ) g + r
which is the desired representation.
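The proof of Theorem C.2 is effectively an algorithm. The following sketch is ours (it works over the rationals via Python's fractions module, with coefficient lists in increasing degree) and carries out the long division step by step:

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r = 0 or deg r < deg g.
    Coefficient lists in increasing degree; g must be nonzero."""
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = [Fraction(x) for x in f]
    while len(r) >= len(g) and any(r):
        while r and r[-1] == 0:          # strip leading zeros
            r.pop()
        if len(r) < len(g):
            break
        coef = r[-1] / Fraction(g[-1])   # the (a_n / b_m) step of the proof
        shift = len(r) - len(g)
        q[shift] = coef
        for i, gi in enumerate(g):       # subtract coef * t^shift * g
            r[i + shift] -= coef * Fraction(gi)
    return q, r

# t^2 - 1 = (t + 1)(t - 1) + 0
q, r = poly_divmod([-1, 0, 1], [-1, 1])
assert q == [Fraction(1), Fraction(1)] and not any(r)
```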
Theorem C.3: The ring K[t] of polynomials over a field K is a principal ideal ring. If I is
an ideal in K[t], then there exists a unique monic polynomial d which gen-
erates I, that is, such that d divides every polynomial f ∈ I.
Proof. Let d be a polynomial of lowest degree in I. Since we can multiply d by a non-
zero scalar and still remain in I, we can assume without loss of generality that d is a monic
polynomial. Now suppose f ∈ I. By Theorem C.2 there exist polynomials q and r such that
f = qd + r where either r = 0 or deg r < deg d
Now f, d ∈ I implies qd ∈ I and hence r = f − qd ∈ I. But d is a polynomial of lowest
degree in I. Accordingly, r = 0 and f = qd, that is, d divides f. It remains to show that
d is unique. If d′ is another monic polynomial which generates I, then d divides d′ and d′
divides d. This implies that d = d′, because d and d′ are monic. Thus the theorem is
proved.
Theorem C.4: Let f and g be nonzero polynomials in K[t]. Then there exists a unique
monic polynomial d such that: (i) d divides f and g; and (ii) if d′ divides
f and g, then d′ divides d.
Definition: The above polynomial d is called the greatest common divisor of f and g. If
d = 1, then f and g are said to be relatively prime.
Proof of Theorem C.4. The set I = {mf + ng : m, n ∈ K[t]} is an ideal. Let d be the
monic polynomial which generates I. Note f, g ∈ I; hence d divides f and g. Now suppose
d′ divides f and g. Let J be the ideal generated by d′. Then f, g ∈ J and hence I ⊂ J.
Accordingly, d ∈ J and so d′ divides d as claimed. It remains to show that d is unique.
If d₁ is another (monic) greatest common divisor of f and g, then d divides d₁ and d₁ divides
d. This implies that d = d₁ because d and d₁ are monic. Thus the theorem is proved.
Corollary C.5: Let d be the greatest common divisor of the polynomials f and g. Then
there exist polynomials m and n such that d = mf + ng. In particular, if
f and g are relatively prime then there exist polynomials m and n such
that mf + ng = 1.
The corollary follows directly from the fact that d generates the ideal
I = {mf + ng : m, n ∈ K[t]}
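The greatest common divisor of Theorem C.4 can be computed by repeatedly applying the division algorithm of Theorem C.2 (the Euclidean algorithm). A sketch, ours, over Q with coefficient lists in increasing degree:

```python
from fractions import Fraction

def trim(p):
    """Drop leading (highest-degree) zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mod(f, g):
    """Remainder of f divided by the nonzero, trimmed polynomial g."""
    r = [Fraction(x) for x in f]
    while len(trim(r)) >= len(g):
        r = trim(r)
        coef = r[-1] / Fraction(g[-1])
        shift = len(r) - len(g)
        for i, gi in enumerate(g):
            r[i + shift] -= coef * Fraction(gi)
    return trim(r)

def poly_gcd(f, g):
    """Monic greatest common divisor via the Euclidean algorithm."""
    f, g = trim(f), trim(g)
    while g:
        f, g = g, poly_mod(f, g)
    return [c / f[-1] for c in f]        # normalize to a monic polynomial

# gcd(t^2 - 1, t^2 + 2t + 1) = t + 1
assert poly_gcd([-1, 0, 1], [1, 2, 1]) == [1, 1]
```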
FACTORIZATION
A polynomial p ∈ K[t] of positive degree is said to be irreducible if p = fg implies
f or g is a scalar.
Lemma C.6: Suppose p ∈ K[t] is irreducible. If p divides the product fg of polynomials
f, g ∈ K[t], then p divides f or p divides g. More generally, if p divides the
product of n polynomials f₁f₂···fₙ, then p divides one of them.
Proof. Suppose p divides fg but not f. Since p is irreducible, the polynomials f and
p must then be relatively prime. Thus there exist polynomials m, n ∈ K[t] such that
mf + np = 1. Multiplying this equation by g, we obtain mfg + npg = g. But p divides fg
and so mfg, and p divides npg; hence p divides the sum g = mfg + npg.
Now suppose p divides f₁f₂···fₙ. If p divides f₁, then we are through. If not, then by
the above result p divides the product f₂···fₙ. By induction on n, p divides one of the poly-
nomials f₂, ..., fₙ. Thus the lemma is proved.
Theorem C.7 (Unique Factorization Theorem): Let f be a nonzero polynomial in K[t].
Then f can be written uniquely (except for order) as a product
f = k p₁p₂···pₙ
where k ∈ K and the pᵢ are monic irreducible polynomials in K[t].
Proof: We prove the existence of such a product first. If f is irreducible or if f ∈ K,
then such a product clearly exists. On the other hand, suppose f = gh where g and h are
nonscalars. Then g and h have degrees less than that of f. By induction, we can assume
g = k₁g₁g₂···gᵣ and h = k₂h₁h₂···hₛ
where k₁, k₂ ∈ K and the gᵢ and hⱼ are monic irreducible polynomials. Accordingly,
f = (k₁k₂) g₁g₂···gᵣ h₁h₂···hₛ
is our desired representation.
We next prove uniqueness (except for order) of such a product for f. Suppose
f = k p₁p₂···pₙ = k′ q₁q₂···qₘ
where k, k′ ∈ K and the p₁, ..., pₙ, q₁, ..., qₘ are monic irreducible polynomials. Now p₁
divides k′q₁···qₘ. Since p₁ is irreducible it must divide one of the qᵢ by the above lemma.
Say p₁ divides q₁. Since p₁ and q₁ are both irreducible and monic, p₁ = q₁. Accordingly,
k p₂···pₙ = k′ q₂···qₘ
By induction, we have that n = m and p₂ = q₂, ..., pₙ = qₘ for some rearrangement of
the qᵢ. We also have that k = k′. Thus the theorem is proved.
If the field K is the complex field C, then we have the following result, known
as the fundamental theorem of algebra; its proof lies beyond the scope of this text.
Theorem C.8 (Fundamental Theorem of Algebra): Let f(t) be a nonzero polynomial
over the complex field C. Then f(t) can be written uniquely (except for
order) as a product
f(t) = k(t − r₁)(t − r₂)···(t − rₙ)
where k, rᵢ ∈ C, i.e. as a product of linear polynomials.
In the case of the real field R we have the following result.
Theorem C.9: Let f(t) be a nonzero polynomial over the real field R. Then f(t) can be
written uniquely (except for order) as a product
f(t) = k p₁(t) p₂(t) ··· pₘ(t)
where k ∈ R and the pᵢ(t) are monic irreducible polynomials of degree one
or two.
INDEX
Abelian group, 320
Absolute value, 4
Addition,
in R", 2
of linear mappings, 128
of matrices, 36
Adjoint,
classical, 176
operator, 284
Algebra,
isomorphism, 169
of linear operators, 129
of square matrices, 43
Algebraic multiplicity, 203
Alternating,
bilinear forms, 262
multilinear forms, 178, 277
Angle between vectors, 282
Annihilator, 227, 251
Antisymmetric
bilinear form, 263
operator, 285
Augmented matrix, 40
Basis, 88
change of, 153
Bessel's inequality, 309
Bijective mapping, 123
Bilinear form, 261, 277
Binary relation, 318
Block matrix, 45
Bounded function, 65
C, 4
Cⁿ, 5
CayleyHamilton theorem, 201, 211
Canonical forms in
Euclidean spaces, 288
unitary spaces, 290
vector spaces, 222
CauchySchwarz inequality, 4, 10, 281
Cells, 45
Change of basis, 153
Characteristic,
equation, 200
matrix, 200
polynomial, 200, 203, 210
value, 198
vector, 198
Classical adjoint, 176
Codomain, 121
Coefficient matrix, 40
Cofactor, 174
Column,
of a matrix, 35
rank, 90
space, 67
vector, 36
Companion matrix, 228
Complex numbers, 4
Components, 2
Composition of mappings, 121
Congruent matrices, 262
Conjugate complex number, 4
Consistent linear equations, 31
Convex, 260
Coordinate, 2
vector, 92
Coset, 229
Cramer's rule, 177
Cyclic group, 325
Cyclic subspaces, 227
Decomposition,
direct sum, 224
primary, 225
Degenerate bilinear form, 262
Dependent vectors, 86
Determinant, 171
Determinantal rank, 195
Diagonal
matrix, 43
of a matrix, 43
Diagonalization,
Euclidean spaces, 288
unitary spaces, 290
vector spaces, 155, 199
Dimension, 88
Direct sum, 69, 82, 224
Disjoint, 316
Distance, 3, 280
Distinguished elements, 41
Division algorithm, 328
Domain,
integral, 322
of a mapping, 121
Dot product,
in C", 6
in R", 3
Dual
basis, 250
space, 249
Echelon form,
linear equations, 21
matrices, 41
Echelon matrix, 41
Eigenspace, 198, 205
Eigenvalue, 198
Eigenvector, 198
Element, 315
Elementary,
column operation, 61
divisors, 229
matrix, 56
row operation, 41
Elimination, 20
Empty set, 315
Equality
of matrices, 36
of vectors, 2
Equations (see Linear equations)
Equivalence relation, 318
Equivalent matrices, 61
Euclidean space, 3, 279
Even
function, 83
permutation, 171
External direct sum, 82
Field, 323
Free variable, 21
Function, 121
Functional, 249
Gaussian integers, 326
Generate, 66
Geometric multiplicity, 203
GramSchmidt orthogonalization, 283
Greatest common divisor, 329
Group, 320
Hermitian,
form, 266
matrix, 266
Hilbert space, 280
Hom (V, U), 128
Homogeneous linear equations, 19
Homomorphism, 123
Hyperplane, 14
Ideal, 322
Identity,
element, 320
mapping, 123
matrix, 43
permutation, 172
Image, 121, 125
Inclusion mapping, 146
Independent
subspaces, 244
vectors, 86
Index
of nilpotency, 225
set, 316
Injective mapping, 123
Inner product, 279
Inner product space, 279
Integers modulo n, 323
Integral domain, 322
Intersection of sets, 316
Invariant subspace, 223
Inverse,
mapping, 123
matrix, 44, 176
Invertible,
linear operator, 130
matrix, 44
Irreducible, 323, 329
Isomorphism of
algebras, 169
groups, 321
inner product spaces, 286, 311
vector spaces, 93, 124
Jordan canonical form, 226
Kernel, 123, 321, 326
l²-space, 280
Line segment, 14, 260
Linear combination
of equations, 30
of vectors, 66
Linear dependence, 86
in Rⁿ, 28
Linear equations, 18, 127, 176, 251, 282
Linear functional, 249
Linear independence, 86
in Rⁿ, 28
Linear mapping, 123
matrix of, 160
rank of, 126
Linear operators, 129
Linear span, 66
Mapping, 121
linear, 123
Matrices, 35
addition, 36
augmented, 40
block, 45
change of basis, 153
coefficient, 40
column, 35
congruent, 262
determinant, 171
diagonal, 43
echelon, 41
equivalent, 61
Hermitian, 266
identity, 43
multiplication, 39
normal, 290
rank, 90
row, 35
row canonical form, 42, 68
row equivalent, 41
row space, 60
scalar, 43
scalar multiplication, 36
similar, 155
size, 35
Matrices (cont.)
square, 43
symmetric, 65, 288
transition, 153
transpose, 39
triangular, 43
zero, 37
Matrix representation,
bilinear forms, 262
linear mappings, 150
Maximal independent set, 89
Minimal polynomial, 202, 212
Minkowski's inequality, 10
Minor, 174
Module, 323
Monic polynomial, 201
Multilinear, 178, 277
Multiplication of matrices, 37, 39
N (positive integers), 315
n-space, 2
n-tuple, 2
Nilpotent, 225
Nonnegative semidefinite, 266
Nonsingular,
linear mapping, 127
matrix, 130
Norm, 279
in Rⁿ, 4
Normal operator, 286, 290, 303
Normal subgroup, 320
Normalized vector, 280
Null set, 315
Nullity, 126
Odd,
function, 73
permutation, 171
Onetoone mappings, 123
Onto mappings, 123
Operations with linear mappings, 128
Operators (see Linear operators)
Ordered pair, 318
Orthogonal
complement, 281
matrix, 287
operator, 286
vectors, 3, 280
Orthogonally equivalent, 288
Orthonormal, 282
Parallelogram law, 307
Parity, 171
Partition, 319
Permutations, 171
Polar form, 264, 307
Polynomials, 327
Positive
matrix, 310
operator, 288
Positive definite,
bilinear form, 265
matrix, 272, 310
operator, 288
Primary decomposition theorem, 225
Prime ideal, 326
Principal ideal, 322
Principal minor, 219
Product set, 317
Projection operator, 243, 308
orthogonal, 281
Proper
subset, 316
value, 198
vector, 198
Q (rational numbers), 315
Quadratic form, 264
Quotient,
group, 320
module, 326
ring, 322
set, 319
space, 229
R (real field), 315
Rⁿ, 2
Rank,
bilinear form, 262
linear mapping, 126
matrix, 90, 195
Rational canonical form, 228
Relation, 318
Relatively prime, 329
Ring, 322
Row,
canonical form, 42
equivalent matrices, 41
of a matrix, 35
operations, 41
rank, 90
reduced echelon form, 41
reduction, 42
vector, 36
Scalar, 2, 63
mapping, 219
matrix, 43
Scalar multiplication, 69
of linear mappings, 128
of matrices, 36
Second dual space, 251
Self-adjoint operator, 285
Set, 315
Sgn, 171
Sign of a permutation, 171
Signature, 265, 266
Similar matrices, 155
Singular mappings, 127
Size of a matrix, 35
Skew-adjoint operator, 285
Skew-symmetric bilinear form, 263
Solution,
of linear equations, 18, 23
space, 65
Span, 66
Spectral theorem, 291
Square matrices, 43
Subgroup, 320
Subring, 322
Subset, 315
Subspace (of a vector space), 65
sum of, 68
Surjective mapping, 123
Sylvester's theorem, 265
Symmetric,
bilinear form, 263
matrix, 65
operator, 285, 288, 300
System of linear equations, 19
Trace, 155
Transition matrix, 153
Transpose,
of a linear mapping, 252
of a matrix, 39
Transposition, 172
Triangle inequality, 293
Triangular,
form, 222
matrix, 43
Trivial solution, 19
Union of sets, 316
Unique factorization, 323
Unit vector, 280
Unitarily equivalent, 288
Unitary,
matrix, 287
operator, 286
space, 279
Universal set, 316
Upper triangular matrix, 43
Usual basis, 88, 89
Vector, 63
in Cⁿ, 5
in Rⁿ, 2
Vector space, 63
Venn diagram, 316
Z (integers), 315
Zₙ (ring of integers modulo n), 323
Zero,
mapping, 124
matrix, 37
of a polynomial, 44
solution, 19
vector, 3, 63
COLLEGE PHYSICS
including 625 SOLVED PROBLEMS
Edited by CAREL W. van der MERWE, Ph.D.,
Professor of Physics, New York University
COLLEGE CHEMISTRY
including 385 SOLVED PROBLEMS
Edited by JEROME L. ROSENBERG, Ph.D.,
Professor of Chemistry, University of Pittsburgh
GENETICS
including 500 SOLVED PROBLEMS
By WILLIAM D. STANSFIELD, Ph.D.,
Dept. of Biological Sciences, Calif. State Polytech.
MATHEMATICAL HANDBOOK
including 2400 FORMULAS and 60 TABLES
By MURRAY R. SPIEGEL, Ph.D.,
Professor of Math., Rensselaer Polytech. Inst.
First Yr. COLLEGE MATHEMATICS
including 1850 SOLVED PROBLEMS
By FRANK AYRES, Jr., Ph.D.,
Professor of Mathematics, Dickinson College
COLLEGE ALGEBRA
including 1940 SOLVED PROBLEMS
By MURRAY R. SPIEGEL, Ph.D.,
Professor of Math., Rensselaer Polytech. Inst.
TRIGONOMETRY
including 680 SOLVED PROBLEMS
By FRANK AYRES, Jr., Ph.D.,
Professor of Mathematics, Dickinson College
MATHEMATICS OF FINANCE
including 500 SOLVED PROBLEMS
By FRANK AYRES, Jr., Ph.D.,
Professor of Mathematics, Dickinson College
PROBABILITY
including 500 SOLVED PROBLEMS
By SEYMOUR LIPSCHUTZ, Ph.D.,
Assoc. Prof. of Math., Temple University
STATISTICS
including 875 SOLVED PROBLEMS
By MURRAY R. SPIEGEL, Ph.D.,
Professor of Math., Rensselaer Polytech. Inst.
ANALYTIC GEOMETRY
including 345 SOLVED PROBLEMS
By JOSEPH H. KINDLE, Ph.D.,
Professor of Mathematics, University of Cincinnati
DIFFERENTIAL GEOMETRY
including 500 SOLVED PROBLEMS
By MARTIN LIPSCHUTZ, Ph.D.,
Professor of Mathematics, University of Bridgeport
CALCULUS
including 1175 SOLVED PROBLEMS
By FRANK AYRES, Jr., Ph.D.,
Professor of Mathematics, Dickinson College
DIFFERENTIAL EQUATIONS
including 560 SOLVED PROBLEMS
By FRANK AYRES, Jr., Ph.D.,
Professor of Mathematics, Dickinson College
SET THEORY and Related Topics
including 530 SOLVED PROBLEMS
By SEYMOUR LIPSCHUTZ, Ph.D.,
Assoc. Prof. of Math., Temple University
FINITE MATHEMATICS
including 750 SOLVED PROBLEMS
By SEYMOUR LIPSCHUTZ, Ph.D.,
Assoc. Prof. of Math., Temple University
MODERN ALGEBRA
including 425 SOLVED PROBLEMS
By FRANK AYRES, Jr., Ph.D.,
Professor of Mathematics, Dickinson College
LINEAR ALGEBRA
including 600 SOLVED PROBLEMS
By SEYMOUR LIPSCHUTZ, Ph.D.,
Assoc. Prof. of Math., Temple University
MATRICES
including 340 SOLVED PROBLEMS
By FRANK AYRES, Jr., Ph.D.,
Professor of Mathematics, Dickinson College
PROJECTIVE GEOMETRY
including 200 SOLVED PROBLEMS
By FRANK AYRES, Jr., Ph.D.,
Professor of Mathematics, Dickinson College
GENERAL TOPOLOGY
including 650 SOLVED PROBLEMS
By SEYMOUR LIPSCHUTZ, Ph.D.,
Assoc. Prof. of Math., Temple University
GROUP THEORY
including 600 SOLVED PROBLEMS
By B. BAUMSLAG, B. CHANDLER, Ph.D.,
Mathematics Dept., New York University
VECTOR ANALYSIS
including 480 SOLVED PROBLEMS
By MURRAY R. SPIEGEL, Ph.D.,
Professor of Math., Rensselaer Polytech. Inst.
ADVANCED CALCULUS
including 925 SOLVED PROBLEMS
By MURRAY R. SPIEGEL, Ph.D.,
Professor of Math., Rensselaer Polytech. Inst.
COMPLEX VARIABLES
including 640 SOLVED PROBLEMS
By MURRAY R. SPIEGEL, Ph.D.,
Professor of Math., Rensselaer Polytech. Inst.
LAPLACE TRANSFORMS
including 450 SOLVED PROBLEMS
By MURRAY R. SPIEGEL, Ph.D.,
Professor of Math., Rensselaer Polytech. Inst.
NUMERICAL ANALYSIS
including 775 SOLVED PROBLEMS
By FRANCIS SCHEID, Ph.D.,
Professor of Mathematics, Boston University
DESCRIPTIVE GEOMETRY
including 175 SOLVED PROBLEMS
By MINOR C. HAWK, Head of
Engineering Graphics Dept., Carnegie Inst. of Tech.
ENGINEERING MECHANICS
including 460 SOLVED PROBLEMS
By W. G. McLEAN, B.S. in E.E., M.S.,
Professor of Mechanics, Lafayette College
and E. W. NELSON, B.S. in M.E., M. Adm. E.,
Engineering Supervisor, Western Electric Co.
THEORETICAL MECHANICS
including 720 SOLVED PROBLEMS
By MURRAY R. SPIEGEL, Ph.D.,
Professor of Math., Rensselaer Polytech. Inst.
LAGRANGIAN DYNAMICS
including 275 SOLVED PROBLEMS
By D. A. WELLS, Ph.D.,
Professor of Physics, University of Cincinnati
STRENGTH OF MATERIALS
including 430 SOLVED PROBLEMS
By WILLIAM A. NASH, Ph.D.,
Professor of Eng. Mechanics, University of Florida
FLUID MECHANICS and HYDRAULICS
including 475 SOLVED PROBLEMS
By RANALD V. GILES, B.S., M.S. in C.E.,
Prof. of Civil Engineering, Drexel Inst. of Tech.
FLUID DYNAMICS
including 100 SOLVED PROBLEMS
By WILLIAM F. HUGHES, Ph.D.,
Professor of Mech. Eng., Carnegie Inst. of Tech.
and JOHN A. BRIGHTON, Ph.D.,
Asst. Prof. of Mech. Eng., Pennsylvania State U.
ELECTRIC CIRCUITS
including 350 SOLVED PROBLEMS
By JOSEPH A. EDMINISTER, M.S.E.E.,
Assoc. Prof. of Elec. Eng., University of Akron
ELECTRONIC CIRCUITS
including 160 SOLVED PROBLEMS
By EDWIN C. LOWENBERG, Ph.D.,
Professor of Elec. Eng., University of Nebraska
FEEDBACK & CONTROL SYSTEMS
including 680 SOLVED PROBLEMS
By J. J. DiSTEFANO III, A. R. STUBBERUD,
and I. J. WILLIAMS, Ph.D.,
Engineering Dept., University of Calif., at L.A.
TRANSMISSION LINES
including 165 SOLVED PROBLEMS
By R. A. CHIPMAN, Ph.D.,
Professor of Electrical Eng., University of Toledo
REINFORCED CONCRETE DESIGN
including 200 SOLVED PROBLEMS
By N. J. EVERARD, MSCE, Ph.D.,
Prof. of Eng. Mech. &amp; Struc., Arlington State Col.
and J. L. TANNER III, MSCE,
Technical Consultant, Texas Industries Inc.
MECHANICAL VIBRATIONS
including 225 SOLVED PROBLEMS
By WILLIAM W. SETO, B.S. in M.E., M.S.,
Assoc. Prof. of Mech. Eng., San Jose State College
MACHINE DESIGN
including 320 SOLVED PROBLEMS
By HALL, HOLOWENKO, LAUGHLIN,
Professors of Mechanical Eng., Purdue University
BASIC ENGINEERING EQUATIONS
including 1400 BASIC EQUATIONS
By W. F. HUGHES, E. W. GAYLORD, Ph.D.,
Professors of Mech. Eng., Carnegie Inst. of Tech.
ELEMENTARY ALGEBRA
including 2700 SOLVED PROBLEMS
By BARNETT RICH, Ph.D.,
Head of Math. Dept., Brooklyn Tech. H.S.
PLANE GEOMETRY
including 850 SOLVED PROBLEMS
By BARNETT RICH, Ph.D.,
Head of Math. Dept., Brooklyn Tech. H.S.
TEST ITEMS IN EDUCATION
including 3100 TEST ITEMS
By G. J. MOULY, Ph.D., L. E. WALTON, Ph.D.,
Professors of Education, University of Miami